Microginin producing proteins and nucleic acids encoding a microginin gene cluster as well as methods for creating novel microginins

ABSTRACT

The invention provides for nucleic acid molecules enabling the synthesis of microginin and microginin analogues. The invention also provides for methods for identifying microginins as well as creating microginins which may not be found in nature.

TECHNICAL FIELD

The present invention relates to the fields of chemistry, biology, biochemistry and molecular biology. The invention provides for novel nucleic acid molecules enabling the synthesis of microginin and microginin analogues. Microginin finds an application in therapeutics. The invention thus extends into the field of mammalian therapeutics and drug development.

INTRODUCTION

Cyanobacteria and Microginin

Cyanobacteria are gram-negative bacteria. Due to their ability to perform photosynthesis they were long thought to belong to the plant kingdom and were formerly classified as blue-green algae. Cyanobacteria have adapted to almost all ecological niches. Most of the strains known to date are found in freshwater lakes and oceans. In the last few years cyanobacteria have been recognised as a source of biologically active natural compounds.

Cyanobacteria are a group of microscopic organisms somewhere “in between” algae and bacteria, and they are found in freshwater and marine areas throughout the world. Scientifically, they are considered to be bacteria, but because they can perform photosynthesis, they also used to be classified as “blue-green algae”.

Cyanobacterial peptides (cyanopeptides) are among the most ubiquitously found potentially hazardous natural products in surface waters used by humans. Though these substances are natural in origin, eutrophication (i.e. excessive loading with fertilising nutrients) has caused massive cyanobacterial proliferation throughout Europe. Thus, cyanopeptides now occur with unnatural frequency and concentration.

A large group among the diverse cyanopeptides are the oligopeptides (peptides with a molecular weight of <2 kDa). But while specific cyanopeptides (e.g. microcystins and nodularins) are well studied and recognised as being causative for many animal poisonings and human illness, a substantial and increasing body of evidence points toward a decisive role of other potentially toxic cyanopeptides in the causation of both acute and chronic human illnesses.

Freshwater and marine cyanobacteria are known to produce a variety of bioactive compounds, among them potent hepatotoxins and neurotoxins. Many of the toxic species of cyanobacteria tend toward massive proliferation in eutrophicated water bodies and thus have been the cause of considerable hazards for animal and human health. One of the most widespread bloom-forming cyanobacteria is the genus Microcystis, a well-known producer of the hepatotoxic peptide microcystin. Microcystins are a group of closely related cyclic heptapeptides sharing a common structure. So far, more than 80 derivatives of microcystins have been identified, varying largely by the degree of methylation, peptide sequence, and toxicity.

The traditional botanical code describes the genus Microcystis as a coccal, unicellular cyanobacterium that grows as mucilaginous colonies of irregularly arranged cells (under natural conditions, while strain cultures usually grow as single cells). According to this tradition, morphological criteria such as size of the individual cells, colony morphology, and mucilage characteristics are used for species delimitation within Microcystis (i.e., morphospecies). Microcystin-producing strains as well as strains that do not synthesize microcystin have been reported for all species within the genus Microcystis. However, whereas most field samples and strains of Microcystis aeruginosa and Microcystis viridis studied to date were found to contain microcystins, strains of M. wesenbergii, M. novacekii, and M. ichthyoblabe have only sporadically been reported to contain microcystins.

Besides microcystins, various other linear and cyclic oligopeptides such as anabaenopeptins, aeruginosins, microginins and cyanopeptolins are found within the genus Microcystis (Namikoshi, M., and K. L. Rinehart. 1996. Bioactive compounds produced by cyanobacteria. J. Ind. Microbiol. 17:373-384).

Similar to microcystins, these peptides possess unusual amino acids like 3-amino-6-hydroxy-2-piperidone (Ahp) in cyanopeptolins, 2-carboxy-6-hydroxyoctahydroindole (Choi) in aeruginosin-type molecules or 3-amino-2-hydroxy-decanoic acid (Ahda) in microginins, and numerous structural variants also exist within these groups. These peptides show diverse bioactivities, frequently protease inhibition (Namikoshi, M., and K. L. Rinehart. 1996. Bioactive compounds produced by cyanobacteria. J. Ind. Microbiol. 17:373-384).

The occurrence of both microcystins and other oligopeptides such as anabaenopeptins, microginins and cyanopeptolins in natural Microcystis populations was recently demonstrated. It is well known that the species and genotype composition in natural Microcystis populations is heterogeneous, and both microcystin- and non-microcystin-containing strains have been isolated from the same sample. Likewise, strains producing microginin and strains not producing microginin have been found. These results suggest a considerable diversity of genotypes with different oligopeptide patterns in natural Microcystis populations.

By typing single Microcystis colonies, it was possible in 1999 to show for the first time that the actual peptide diversity in a natural population of this genus is extremely high. Many of the substances detected belong to well-known groups of cyanobacterial peptides like microcystins, anabaenopeptins, microginins, cyanopeptolins, and aeruginosins, of which many have been discovered in Microcystis spp. In addition, numerous unknown components have been detected in such colonies. However, the origin of these unknown components has yet to be investigated, since besides the observed epiphytic cyanobacteria and algae, heterotrophic bacteria are also known to be present in Microcystis colonies. Chemical screening of cyanobacterial samples (both from field samples and from culture strains) has demonstrated a wide variety of substances: e.g. an almost monospecific bloom of Planktothrix agardhii contained as many as 255 different substances, most of which were oligopeptides.

Thus, it may be concluded that assigning the capability of microginin production to certain species and strains, and with it a true understanding of the genotypes and species involved as well as their evolution, has to date not been possible. In fact PEPCY, a research project supported by the European Commission, concluded that present information shows that one species or “morphotype” (i.e. individuals with the same morphological characteristics) may comprise a range of genotypes that encode for different “chemotypes” (i.e. morphologically indistinguishable individuals containing different cyanopeptides).

ACE Inhibitors and Microginin

ACE (angiotensin-converting enzyme) catalyses the conversion of angiotensin I into angiotensin II within the mammalian renin-angiotensin system, leading to a narrowing of the arteries (vasoconstriction), which in turn causes an increase of blood pressure. ACE inhibitors counteract this process and therefore play a role in human medicine as blood pressure lowering agents. Microginin is an important drug candidate for ACE inhibition. So far only 30 structural variants of microginin are known, making clinical development difficult.

Microginins are characterized by a decanoic acid derivative, 3-amino-2-hydroxy-decanoic acid (Ahda), at the N-terminus and a predominance of two tyrosine units at the C-terminus. They vary in length from 4 to 6 amino acids with the variability occurring at the C-terminal end (Microginins, zinc metalloprotease inhibitors from the cyanobacterium Microcystis aeruginosa, 2000, Tetrahedron 56:8643-8656). In the past it has only been possible to generate microginin variants chemically, by means of synthesis of 3-amino-2-hydroxy-decanoic acid (J. Org. Chem. 1999 Apr. 16; 64(8):2852-2859. Acylnitrene Route to Vicinal Amino Alcohols. Application to the Synthesis of (−)-Bestatin and Analogues. Bergmeier S C, Stanchina D M.). Alternatively, cyanobacterial strains were screened for microginin activity, which was tedious and time consuming. It has so far not been possible to screen for strains efficiently due to the lack of species understanding and of a methodology for efficiently distinguishing microginin producers from non-producers (see above). Further, it was not possible to easily and efficiently alter and thus develop microginins in order to provide a variety of lead compounds from which better ACE inhibitors may be developed.

BRIEF DESCRIPTION OF THE INVENTION

From Microcystis aeruginosa a cluster of genes spanning about 30 kbp has been isolated, encoding a hybrid synthetase composed of non-ribosomal peptide synthetases (NRPS), polyketide synthases (PKS) and tailoring enzymes which, as the inventors show, is responsible for the biosynthesis of microginin. The strain from which this nucleic acid was first isolated was obtained by G. C. Kürzinger from Lake Pehlitz in 1977.

The inventors provide a biological system enabling not only the production of microginins and the heterologous expression of microginin, but also the modification of microginin and thus the development of hitherto unknown variants of microginin. The invention further provides for nucleic acids and methods for identifying strains which have the ability to produce microginin.

In particular the invention relates to one or more nucleic acids encoding a microginin synthetase enzyme complex with the following activities: an adenylation domain (A*), wherein the adenylation domain comprises a peptide sequence according to SEQ ID NO. 1, an acyl carrier protein (ACP), an elongation module (EM) of polyketide synthases (PKS) comprising the following activities: (i) ketoacylsynthase (KS), (ii) acyl transferase (AT), (iii) acyl carrier protein (ACP2), an aminotransferase (AMT), three to five elongation modules (EM) of non-ribosomal peptide synthetases (NRPS) comprising the following activities: (i) condensation domain (C), (ii) adenylation domain (A), (iii) thiolation domain (T), and a thioesterase (TE).

DETAILED DESCRIPTION OF THE INVENTION

As outlined above, the invention in particular relates to one or more nucleic acids encoding a microginin synthetase enzyme complex with the following activities: an adenylation domain (A*), wherein the adenylation domain comprises a peptide sequence according to SEQ ID NO. 1, an acyl carrier protein (ACP), an elongation module (EM) of polyketide synthases (PKS) comprising the following activities: (i) ketoacylsynthase (KS), (ii) acyl transferase (AT), (iii) acyl carrier protein (ACP 2), an aminotransferase (AMT), three to five elongation modules (EM) of non-ribosomal peptide synthetases (NRPS) comprising the following activities: (i) condensation domain (C), (ii) adenylation domain (A), (iii) thiolation domain (T), and a thioesterase (TE).

The inventors have found that microginin is the product of non-ribosomal synthesis. It is important to understand that microginin as previously identified in nature may also in part have been the product of ribosomal synthesis and further processed via various enzymatic reactions.

It is important to note that the nucleic acid claimed herein, i.e. one encoding a microginin synthetase enzyme complex, may also be present in organisms other than Microcystis sp., such as Nostoc, Anabaena, Planktothrix or Oscillatoria. The term microginin shall thus not limit the invention to such nucleic acids producing synthetase enzyme complexes resulting in peptides officially termed “microginin”.

Herein, an adenylation domain (A*) is understood to activate octanoic acid as an acyl adenylate and an acyl carrier protein (ACP) is understood to bind the octanoic acid adenylate as a thioester.

An elongation module (EM) of polyketide synthases (PKS), as also known e.g. from the jamaicamide synthetase gene cluster isolated from Lyngbya majuscula (Chem. Biol. Vol. 11, 2004 pp 817-833. Structure and Biosynthesis of the Jamaicamides, new mixed polyketide-peptide neurotoxin from the marine cyanobacterium Lyngbya majuscula), herein comprises at least the following activities: (i) ketoacylsynthase (KS), (ii) acyl transferase (AT) and (iii) acyl carrier protein (ACP2). The AT is responsible for the recognition of malonyl-CoA, the KS is responsible for the Claisen-type condensation of the activated octanoic acid adenylate with malonyl-CoA and the ACP2 is responsible for binding of the resulting decanoic acid.

An aminotransferase (AMT) performs the β-amination of the decanoic acid.

The nucleic acid according to the invention may have three to five elongation modules (EM) of non-ribosomal peptide synthetases (NRPS) comprising at least the following activities: (i) condensation domain (C), (ii) adenylation domain (A), (iii) thiolation domain (T). The A is responsible for the activation of carboxyl groups of amino acids, the T is responsible for the binding and the transport of the activated intermediate, and the C is responsible for the condensation of the activated amino acids with the growing peptide chain.

Finally the nucleic acid according to the invention shall contain a thioesterase (TE) activity which performs the cleavage of the final product from the synthetase complex.
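The minimal set of activities described above can be sketched as a simple data model. The following Python sketch is illustrative only: the class and function names are hypothetical, the module order simply follows the order in which the activities are listed in the text (which is an assumption about their physical arrangement), and the expected peptide length of one Ahda unit plus one amino acid per NRPS module is an inference from the stated length of 4 to 6 residues.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One functional unit of the synthetase complex (hypothetical model)."""
    name: str
    domains: tuple


def build_complex(n_nrps_modules: int) -> list:
    """Return the minimal module list for 3 to 5 NRPS elongation modules."""
    if not 3 <= n_nrps_modules <= 5:
        raise ValueError("the text specifies three to five NRPS elongation modules")
    modules = [
        Module("loading", ("A*", "ACP")),        # activates and binds octanoic acid
        Module("PKS-EM", ("KS", "AT", "ACP2")),  # extends with malonyl-CoA
        Module("AMT", ("AMT",)),                 # beta-amination of the decanoic acid
    ]
    modules += [
        Module(f"NRPS-EM{i}", ("C", "A", "T"))   # one amino acid per module
        for i in range(1, n_nrps_modules + 1)
    ]
    modules.append(Module("release", ("TE",)))   # cleaves the final product
    return modules


def domain_string(modules) -> str:
    """Flatten the modules into a dash-separated domain order."""
    return "-".join(d for m in modules for d in m.domains)


def expected_peptide_length(n_nrps_modules: int) -> int:
    """Assumed length: one Ahda starter unit plus one residue per NRPS module."""
    return 1 + n_nrps_modules


if __name__ == "__main__":
    print(domain_string(build_complex(4)))
    print(expected_peptide_length(4))
```

Under these assumptions, three to five NRPS modules correspond to products of 4 to 6 residues, which matches the length range stated for known microginins.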

One may envision that the nucleic acid according to the invention is present in a vector or a bacterial chromosome, in which case the portions designated above, while being in one cell, need not all be in, or on, one molecule. It is essential to the invention, however, that a cell meant to produce microginin synthetase enzyme complex contains the activities designated above in order to produce an enzyme complex according to the invention which in turn may produce a microginin. Thus, the invention also encompasses derivatives of the nucleic acid molecule as outlined above having the function of a microginin synthetase enzyme complex.

The molecule is characterized by a special adenylation domain (A*) which is unusual in that it is not similar to known adenylation domains found in other molecules encoding non-ribosomal enzyme complexes such as the microcystin synthetase gene cluster (Chem. Biol. Vol. 7 2000, pp 753-764: Structural organisation of microcystin synthesis in Microcystis aeruginosa PCC 7806: An integrated peptide-polyketide-synthetase system). Molecules encompassed herein are those which carry this adenylation domain (A*) as depicted in SEQ ID NO. 1 and at least an ACP, whereby this ACP may stem from another known non-ribosomal enzyme complex, at least one EM of PKS, whereby this EM may stem from another known non-ribosomal enzyme complex, comprising at least the following activities: (i) KS, (ii) AT, (iii) ACP, an AMT, whereby this AMT may stem from another known non-ribosomal enzyme complex, three to five EMs comprising at least the following activities: (i) C, (ii) A, (iii) T, whereby these EMs may stem from another known non-ribosomal enzyme complex, and a TE, whereby this TE may stem from another known non-ribosomal enzyme complex. Chimeras whereby parts of the above are on one or more vectors and/or integrated in chromosomes are equally encompassed by the invention as long as all the components are in one cell.

The invention also pertains to isolated nucleic acid molecules encoding a microginin synthetase enzyme complex comprising an adenylation domain which is 85% identical to SEQ ID NO. 1, more preferred 90% identical to SEQ ID NO. 1, most preferred 95% identical to SEQ ID NO. 1. Sequence identity herein is in percent of total sequence of the adenylation domains when aligned with conventional amino acid alignment software, such as the BestFit and/or PileUp programs of the GCG package.
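The identity thresholds above can be illustrated with a short calculation. The following Python sketch assumes the two sequences have already been aligned by some alignment program and are supplied with "-" as the gap character. Counting identity over all alignment columns, and reading the thresholds as "at least 85%, 90% or 95%", are assumptions; the function names are hypothetical and the example sequences are invented, not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the tiers named in the text (assumed as minimums)."""
    if pct >= 95:
        return "most preferred"
    if pct >= 90:
        return "more preferred"
    if pct >= 85:
        return "within scope"
    return "below threshold"


if __name__ == "__main__":
    # Invented 10-residue toy sequences differing at one position.
    pct = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
    print(pct, identity_tier(pct))
```

A different denominator (for example, the length of the shorter sequence) would give slightly different values; the choice made by a given alignment program should be checked before comparing against the thresholds.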

The invention also pertains to a microginin synthetase enzyme protein complex with the following activities: an adenylation domain (A*), wherein the adenylation domain comprises a peptide sequence according to SEQ ID NO. 1, an acyl carrier protein (ACP), an elongation module (EM) of polyketide synthases (PKS) comprising the following activities: (i) ketoacylsynthase (KS), (ii) acyl transferase (AT), (iii) acyl carrier protein (ACP 2), an aminotransferase (AMT), three to five elongation modules (EM) of non-ribosomal peptide synthetases (NRPS) comprising the following activities: (i) condensation domain (C), (ii) adenylation domain (A), (iii) thiolation domain (T), and a thioesterase (TE).

The invention in particular also relates to a nucleic acid molecule encoding an adenylation domain (A*), wherein the adenylation domain comprises a peptide sequence according to SEQ ID NO. 1.

The invention in particular also relates to a peptide molecule, an adenylation domain (A*), wherein the molecule comprises a peptide sequence according to SEQ ID NO. 1.

The invention in particular also relates to a nucleic acid molecule encoding an adenylation domain (A*), wherein the molecule comprises a nucleic acid sequence according to SEQ ID NO. 25.

In a preferred embodiment of the invention the nucleic acid additionally and optionally comprises sequences encoding the following activities or domains: a monooxygenase (MO), an integrated N-methyltransferase domain (MT) within one or more elongation modules (EM) of NRPS, a non-integrated N-methyltransferase (MT), a modifying activity (MA), wherein said MA is selected from the group comprising the following activities: halogenase, sulfatase, glycosylase, racemase, O-methyltransferase and C-methyltransferase, and two or more peptide repeat spacer sequences (SP) consisting of one or more repeats being either glycine rich or proline and leucine rich, located adjacently upstream and downstream of the MO and/or another MA.

Herein MO is an enzyme catalyzing the hydroxylation of the decanoic acid, an integrated N-methyltransferase domain (MT) within one or more elongation modules (EM) of NRPS catalyzes the methylation of the amide bond by the respective module, and a non-integrated N-methyltransferase (MT) catalyzes the methylation of an amino group of the microginin.

The term modifying enzyme stands for numerous enzymes; such enzymes may add groups or create bonds. In a preferred embodiment MA is selected from the group comprising the following activities: halogenase, sulfatase, glycosylase, racemase, O-methyltransferase and C-methyltransferase.

Nucleic acids encoding two or more peptide repeat spacer sequences (SP) consisting of one or more repeats being either glycine rich or proline and leucine rich have astonishingly been found by the inventors to aid in integration of novel MAs into existing microginin synthetase enzyme complexes. By means of placing such SPs adjacently to MAs the inventors are able to create microginin synthetase enzyme complexes (MSEC) comprising activities previously not found in MSECs. This in turn allows for the creation of novel microginins with potentially novel therapeutic properties. Thus the invention relates to nucleic acids encoding two or more peptide repeat spacer sequences (SP) consisting of one or more repeats being either glycine rich or proline and leucine rich, which may be positioned adjacently to an MA such as but not limited to a halogenase, a sulfatase, a glycosylase, a racemase, an O-methyltransferase or a C-methyltransferase. These SPs aid in ensuring that the “foreign” activity “works” in the enzyme complex. The inventors have found that this is due to the lack of secondary structures in the SP peptide chains.

The nucleic acid according to the invention in a preferred embodiment optionally comprises nucleic acid sequences encoding protein sequences as follows:

An adenylation domain (A*) according to SEQ ID NO. 1, an acyl carrier protein (ACP) according to SEQ ID NO. 2, an elongation module of polyketide synthases responsible for the activation and the condensation of malonyl-CoA: (i) ketoacylsynthase domain (KS) according to SEQ ID NO. 3, (ii) acyl transferase domain (AT) according to SEQ ID NO. 4, (iii) acyl carrier protein domain (ACP 2) according to SEQ ID NO. 5, an aminotransferase (AMT) according to SEQ ID NO. 6, an elongation module of non-ribosomal peptide synthetases responsible for the activation and condensation of alanine: (i) condensation domain (C) according to SEQ ID NO. 7, (ii) adenylation domain (A) according to SEQ ID NO. 8, (iii) thiolation domain (T) according to SEQ ID NO. 9, an elongation module of non-ribosomal peptide synthetases responsible for the activation and condensation of leucine: (i) condensation domain (C 2) according to SEQ ID NO. 10, (ii) adenylation domain (A 2) according to SEQ ID NO. 11, (iii) thiolation domain (T 2) according to SEQ ID NO. 12, an elongation module of non-ribosomal peptide synthetases responsible for the activation and condensation of tyrosine 1: (i) condensation domain (C 3) according to SEQ ID NO. 13, (ii) adenylation domain (A 3) according to SEQ ID NO. 14, (iii) thiolation domain (T 3) according to SEQ ID NO. 15, an elongation module of non-ribosomal peptide synthetases responsible for the activation and condensation of tyrosine 2: (i) condensation domain (C 4) according to SEQ ID NO. 16, (ii) adenylation domain (A 4) according to SEQ ID NO. 17, (iii) thiolation domain (T 4) according to SEQ ID NO. 18, a thioesterase (TE) according to SEQ ID NO. 19, a monooxygenase (MO) according to SEQ ID NO. 20, two or more peptide repeat spacer sequences (SP1/SP2) according to SEQ ID NO. 21 and 22, an integrated N-methyltransferase domain (MT) within the elongation module (EM) of the NRPS responsible for the activation and condensation of leucine according to SEQ ID NO. 23 and a non-integrated N-methyltransferase (MT 2) according to SEQ ID NO. 24.

As outlined above, the minimal requirement according to the invention is a nucleic acid encoding a microginin synthetase enzyme complex with the following activities: an adenylation domain (A*), wherein the adenylation domain comprises a peptide sequence according to SEQ ID NO. 1, an ACP according to SEQ ID NO. 2, an elongation module (EM) of polyketide synthases (PKS) comprising the following activities: (i) ketoacylsynthase (KS) according to SEQ ID NO. 3, (ii) acyl transferase (AT) according to SEQ ID NO. 4, (iii) acyl carrier protein (ACP 2) according to SEQ ID NO. 5, an aminotransferase (AMT) according to SEQ ID NO. 6, three to five elongation modules (EM) of non-ribosomal peptide synthetases (NRPS) comprising the following activities: (i) condensation domain (C) according to SEQ ID NO. 7, (ii) adenylation domain (A) according to SEQ ID NO. 8, (iii) thiolation domain (T) according to SEQ ID NO. 9, and a thioesterase (TE) according to SEQ ID NO. 19. A molecule comprising the above sequences is preferred herein.

The invention explicitly also relates to analogs hereto, additionally comprising, e.g. other activities and/or spacer regions, both transcribed and non-transcribed.

It is apparent to those skilled in the art that amino acids may be exchanged while maintaining the enzymatic activity required. Thus, the invention also relates to molecules with sequences which are not identical to those outlined above but are altered only in so far as the enzymatic activity desired is retained.

The nucleic acid according to the invention may contain nucleic acids selected from the group comprising: an adenylation domain (A*) according to SEQ ID NO. 25, an acyl carrier protein (ACP) according to SEQ ID NO. 26, an elongation module of polyketide synthases encoding for the activation and the condensation of malonyl-CoA: (i) ketoacylsynthase domain (KS) according to SEQ ID NO. 27, (ii) acyl transferase domain (AT) according to SEQ ID NO. 28, (iii) acyl carrier protein domain (ACP 2) according to SEQ ID NO. 29, an aminotransferase (AMT) according to SEQ ID NO. 30, an elongation module of non-ribosomal peptide synthetases encoding for the activation and condensation of alanine: (i) condensation domain (C) according to SEQ ID NO. 31, (ii) adenylation domain (A) according to SEQ ID NO. 32, (iii) thiolation domain (T) according to SEQ ID NO. 33, an elongation module of non-ribosomal peptide synthetases encoding for the activation and condensation of leucine: (i) condensation domain (C 2) according to SEQ ID NO. 34, (ii) adenylation domain (A 2) according to SEQ ID NO. 35, (iii) thiolation domain (T 2) according to SEQ ID NO. 36, an elongation module of non-ribosomal peptide synthetases encoding for the activation and condensation of tyrosine 1: (i) condensation domain (C 3) according to SEQ ID NO. 37, (ii) adenylation domain (A 3) according to SEQ ID NO. 38, (iii) thiolation domain (T 3) according to SEQ ID NO. 39, an elongation module of non-ribosomal peptide synthetases encoding for the activation and condensation of tyrosine 2: (i) condensation domain (C 4) according to SEQ ID NO. 40, (ii) adenylation domain (A 4) according to SEQ ID NO. 41, (iii) thiolation domain (T 4) according to SEQ ID NO. 42, a thioesterase (TE) according to SEQ ID NO. 43, a monooxygenase (MO) according to SEQ ID NO. 44, two or more peptide repeat spacer sequences (SP1/SP2) according to SEQ ID NO. 45 and 46, an integrated N-methyltransferase domain (MT) within the elongation module (EM) of the NRPS encoding for the activation and condensation of leucine according to SEQ ID NO. 47 and a non-integrated N-methyltransferase (MT 2) according to SEQ ID NO. 48.

As outlined above, the minimal requirement according to the invention is a nucleic acid encoding a microginin synthetase enzyme complex with the following activities: an adenylation domain (A*), wherein the adenylation domain has a nucleic acid sequence according to SEQ ID NO. 25, an ACP with a nucleic acid sequence according to SEQ ID NO. 26, an elongation module (EM) of polyketide synthases (PKS) comprising the following activities: (i) ketoacylsynthase (KS) with a nucleic acid sequence according to SEQ ID NO. 27, (ii) acyl transferase (AT) with a nucleic acid sequence according to SEQ ID NO. 28, (iii) acyl carrier protein (ACP 2) with a nucleic acid sequence according to SEQ ID NO. 29, an aminotransferase (AMT) with a nucleic acid sequence according to SEQ ID NO. 30, three to five elongation modules (EM) of non-ribosomal peptide synthetases (NRPS) comprising the following activities: (i) condensation domain (C) with a nucleic acid sequence according to SEQ ID NO. 31, (ii) adenylation domain (A) with a nucleic acid sequence according to SEQ ID NO. 32, (iii) thiolation domain (T) with a nucleic acid sequence according to SEQ ID NO. 33, and a thioesterase (TE) with a nucleic acid sequence according to SEQ ID NO. 43. A molecule comprising the above sequences is preferred herein.

The invention also relates to nucleic acid molecules with sequences which are not identical to those outlined above but are altered only in so far as the enzymatic activity desired is retained. In particular one skilled in the art will know that positions in nucleic acid triplets may “wobble” and these positions may thus be altered with no influence on the peptide sequence. Further, many amino acids are encoded by more than one DNA triplet. One skilled in the art will know that one may alter such triplets while maintaining the amino acid sequence. Thus said sequences are equally encompassed by the invention.
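The point about synonymous triplets can be made concrete with a small example. The following Python sketch builds the standard genetic code table and shows that two different DNA sequences encode the same peptide. The function names are hypothetical and the two short DNA sequences are invented for illustration; they are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order for each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence (length divisible by 3) to one-letter codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def synonymous_codons(amino_acid: str) -> list:
    """All codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


if __name__ == "__main__":
    original = "GCTCTGTATTAT"   # invented example: Ala-Leu-Tyr-Tyr
    altered = "GCCTTATACTAC"    # every codon changed, same peptide
    print(translate(original), translate(altered))
    print(synonymous_codons("L"))
```

Both sequences translate to the same four residues even though they differ at five of twelve nucleotide positions, which is the kind of alteration the paragraph above describes.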

The invention also pertains to isolated nucleic acid molecules encoding a microginin synthetase enzyme complex comprising an adenylation domain which is 85% identical to SEQ ID NO. 25, more preferred 90% identical to SEQ ID NO. 25, most preferred 95% identical to SEQ ID NO. 25. Sequence identity herein is in percent of total sequence of the adenylation domains when aligned with conventional nucleotide alignment software such as the BestFit and/or PileUp programs of the GCG package.

In a preferred embodiment the one or more nucleic acids according to the invention are organized in sequence parts encoding the microginin synthetase enzyme complex in an upstream to downstream manner as depicted in FIG. 1. In a particularly preferred embodiment the activities and domains are arranged as shown and on one molecule.

The nucleic acid molecule may be part of a vector. Such vectors are in particular bacterial artificial chromosomes (BAC), cosmids or fosmids, and lambda vectors. Preferred plasmid vectors which are able to replicate autonomously in cyanobacteria are derived from the pVZ vectors. Preferred fosmid vectors which are able to replicate autonomously in cyanobacteria are derived from the pCC1FOS™ and pCC2FOS™ vectors (Epicentre Biotechnologies). The integration of the nucleic acid according to the invention into the vector is a procedure known to those skilled in the art (Molecular Cloning: A Laboratory Manual, 1989, Cold Spring Harbor Laboratory Press) and described in the manuals of manufacturers of kits for creation of genomic libraries (e.g. Epicentre Biotechnologies).

In a preferred embodiment the invention concerns a microorganism transformed with a nucleic acid according to the invention. The nucleic acid according to the invention may be integrated into the chromosome of the host organism or may be present on a separate vector (see also examples). It is preferred that the phototrophic cyanobacterial host organism is selected from the group comprising: Synechocystis sp., Synechococcus sp., Anabaena sp., Nostoc sp., Spirulina sp., Microcystis sp. . . . Cells are cultured as follows:

Media: BG 11 (for cultivation of cyanobacteria)
Aeration: air containing 0.3-3.0% carbon dioxide
Light intensity: 40-100 μE/m²*s (diameter of illuminated culture vessels of photobioreactor d=4-12 cm)
Cell density at harvest: OD_(750nm) 1-2

And if the host is Microcystis aeruginosa:
Light quality: Additional red light illumination with 25 μE/m²*s for 24-48 hours before harvesting.

It is preferred that the heterotrophic host organism is selected from the group comprising E. coli and Bacillus sp., due to a more suitable GC content and codon usage than other heterotrophic bacteria.

When E. coli is used for the heterologous expression of the microginin synthetase, a phosphopantetheine transferase (Ppt) has to be co-expressed in order to enable the synthesis of microginin. The co-expression of the Ppt from a microginin-producing strain would be preferred. Other Ppts with a broad specificity, even from heterotrophic organisms like Bacillus sp., are also suitable.

In one embodiment the invention relates to a method of producing a microginin, comprising culturing a cell under conditions under which the cell will produce microginin, wherein said cell comprises a nucleic acid encoding a recombinant microginin, according to the invention, and wherein said cell does not produce the microginin in the absence of said nucleic acid.

The inventors have identified nucleic acid sequences which for the first time make it possible to detect nucleic acids encoding a microginin synthetase enzyme complex. This has been extremely difficult, due to the fact that other gene clusters which encode non-ribosomal protein producing complexes share sequence similarity with the cluster claimed herein. Such primers or probes according to the invention are selected from the group of: a) nucleic acid according to SEQ ID NO. 49 (Primer A), b) nucleic acid according to SEQ ID NO. 50 (Primer B), c) nucleic acid according to SEQ ID NO. 51 (Primer C), d) nucleic acid according to SEQ ID NO. 52 (Primer D), e) nucleic acid according to SEQ ID NO. 53 (Primer E), f) nucleic acid according to SEQ ID NO. 54 (Primer F), g) nucleic acid according to SEQ ID NO. 55 (Primer G), h) nucleic acid according to SEQ ID NO. 56 (Primer H), i) nucleic acid according to SEQ ID NO. 57 (Primer I) and j) nucleic acid according to SEQ ID NO. 58 (Primer J). It is known to one skilled in the art that such primers or probes may be altered slightly and still accomplish the task of specifically detecting the desired target sequence. Such alterations in sequence are equally encompassed by the invention. The primers or probes according to the invention may be applied in hybridization reactions and/or amplification reactions. Such reactions are known to one skilled in the art.

The invention also concerns a method for detecting a microginin synthetase gene cluster in a sample, wherein one or more of the nucleic acids according to the invention are applied in an amplification and/or a hybridization reaction.

In a preferred embodiment of the method according to the invention, primers D and F, or H and J, or E and I, or E and A are added to a PCR reaction mixture comprising a sample, wherein the presence of an amplification product indicates the presence of a microginin synthetase gene cluster and the absence of an amplification product indicates the absence of a microginin synthetase gene cluster. As can be seen from the examples (example 3 below), certain combinations are preferred. Samples may be isolated DNA or prokaryotic cells stemming from plates or liquid cultures.

When performing an amplification reaction with primers D and F, the most preferred amplification conditions are as follows: a) denaturing, b) annealing at 48° C. and c) elongation (product size: 675 bp). These temperatures may vary within a range of 2-8° C.

When performing an amplification reaction with primers H and J, the most preferred amplification conditions are as follows: a) denaturing, b) annealing at 54° C. and c) elongation (product size: 1174 bp). These temperatures may vary within a range of 2-8° C.

When performing an amplification reaction with primers E and I, the most preferred amplification conditions are as follows: a) denaturing, b) annealing at 56° C. and c) elongation (product size: 1279 bp). These temperatures may vary within a range of 2-8° C.

When performing an amplification reaction with primers E and A, the most preferred amplification conditions are as follows: a) denaturing, b) annealing at 57° C. and c) elongation (product size: 621 bp). These temperatures may vary within a range of 2-8° C. The primer concentration is most commonly 0.2-1.0 μM. Buffers and other reagents depend on the polymerase used.
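The four primer pairs, annealing temperatures and expected product sizes stated above can be collected in a small lookup table, for example to compare an observed band size with the expected products. The following Python sketch is illustrative only: the names `PRIMER_PAIRS` and `candidate_pairs` and the band-size tolerance of 25 bp are assumptions made for this example and are not part of the described method.

```python
# Primer pairs with annealing temperature (deg C) and expected product size (bp),
# taken from the amplification conditions stated in the text.
PRIMER_PAIRS = {
    ("D", "F"): {"annealing_c": 48, "product_bp": 675},
    ("H", "J"): {"annealing_c": 54, "product_bp": 1174},
    ("E", "I"): {"annealing_c": 56, "product_bp": 1279},
    ("E", "A"): {"annealing_c": 57, "product_bp": 621},
}


def candidate_pairs(observed_bp, tolerance_bp=25):
    """Return the primer pairs whose expected product size lies within
    tolerance_bp of an observed band size.

    tolerance_bp is a hypothetical reading tolerance chosen for this sketch.
    """
    return [
        pair
        for pair, info in PRIMER_PAIRS.items()
        if abs(info["product_bp"] - observed_bp) <= tolerance_bp
    ]


if __name__ == "__main__":
    # A band read at roughly 680 bp is compared with the expected products.
    print(candidate_pairs(680))
```

A band that matches no entry within the chosen tolerance yields an empty list, which in this sketch simply means that no expected product was recognised.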

When performing hybridisation reactions, the above nucleic acids are usually labeled. Such labels may be radioactive or non-radioactive, such as fluorescent. The nucleic acid primers or probes may be applied, e.g., for the screening of libraries.

The invention also relates to antibodies against a peptide according to SEQ ID NO. 1 (A*). The creation of such antibodies is known to one skilled in the art. The antibodies may be polyclonal or monoclonal. Such antibodies may be labeled or non-labeled; they may also be altered in other ways, e.g. humanized.

The inventors have astonishingly found that newly identified peptide repeat spacer sequences (SP) may be placed adjacently to MAs I in order to create novel hybrid gene clusters. These SPs act by spacing the novel activity or domain so that it is functionally active in the microginin synthetase enzyme complex.

The invention thus further relates to nucleic acids encoding a peptide repeat spacer sequence (SP), wherein the peptide sequence comprises at least 4 glycine amino acids per single repeat unit (SRU) or at least 5 proline and/or leucine amino acids per SRU, an SRU within the SP is between 7 and 15 amino acids in length, and the SP comprises between 2 and 10 SRUs.

The invention further relates to peptides of a peptide repeat spacer sequence (SP), wherein the peptide sequence comprises at least 4 glycine amino acids or at least 5 proline and/or leucine amino acids, the single repeat unit (SRU) within the SP is between 7 and 15 amino acids in length, and the SP comprises between 2 and 10 SRUs. In a preferred embodiment of the invention the SRU is between 9 and 13 amino acids in length; in a particularly preferred embodiment the SRU is eleven amino acids in length. In a preferred embodiment the SP comprises between 3 and 9 SRUs.
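The SP criteria stated above (glycine or proline/leucine content per SRU, SRU length, number of SRUs) can be checked mechanically for a candidate peptide. The following Python sketch is illustrative only: the function names are hypothetical, one-letter amino acid codes are assumed, and the SP is assumed to consist of consecutive SRUs of equal length. The repeat units used in the usage example are read from the entries labeled SP 1 and SP 2 in Table 1.

```python
def split_into_srus(sp, sru_len):
    """Split an SP peptide (one-letter codes) into consecutive SRUs of equal length."""
    if sru_len <= 0 or len(sp) % sru_len != 0:
        raise ValueError("SP length must be a multiple of the SRU length")
    return [sp[i:i + sru_len] for i in range(0, len(sp), sru_len)]


def sru_meets_criteria(sru):
    """An SRU is 7-15 residues long and contains at least 4 glycines
    or at least 5 prolines and/or leucines."""
    if not 7 <= len(sru) <= 15:
        return False
    glycines = sru.count("G")
    pro_leu = sru.count("P") + sru.count("L")
    return glycines >= 4 or pro_leu >= 5


def sp_meets_criteria(sp, sru_len):
    """An SP comprises between 2 and 10 SRUs, each meeting the SRU criteria."""
    srus = split_into_srus(sp, sru_len)
    return 2 <= len(srus) <= 10 and all(sru_meets_criteria(s) for s in srus)


if __name__ == "__main__":
    sp1 = "IDPPLTPLDKG" * 3   # three 11-residue repeats, proline/leucine-rich
    sp2 = "PYQGGLGGDQS" * 3   # three 11-residue repeats, glycine-rich
    print(sp_meets_criteria(sp1, 11), sp_meets_criteria(sp2, 11))
```

The sketch applies the content criterion to every SRU; a peptide in which a single repeat unit deviates would therefore be rejected, which may be stricter than intended for naturally occurring spacers.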

In a preferred embodiment the nucleic acid encoding the peptide repeat spacer sequence (SP) according to the invention encodes a peptide SRU as shown in SEQ ID NO. 20 or SEQ ID NO. 21. In a further embodiment the peptide repeat spacer sequence (SP) according to the invention comprises or contains a sequence as shown in SEQ ID NO. 20 or SEQ ID NO. 21. In a further embodiment the nucleic acid according to the invention has a sequence as laid down in SEQ ID NO. 43 or SEQ ID NO. 44.

Not only by means of the above-mentioned SPs, but in particular because of them, the inventors are able to create enzyme complexes resulting in microginin variants which may not be found in nature. This is an essential aspect of the present invention. The invention provides, for the first time, a simple method of producing recombinant microginin variants comprising modifying the nucleic acid according to the invention in vitro or in vivo, growing a recombinant cell comprising said recombinantly modified nucleic acid encoding a microginin synthetase under conditions which lead to synthesis of a microginin, and recovering the synthesized microginin.

In a preferred embodiment of said method according to the invention, said modifying of said nucleic acid may be an action selected from the group of one or more of the following actions: a) inactivation of one or more of the MTs present, b) substitution of one or more of the MTs present with a halogenase, a sulfatase, a glycosylase, a racemase, an O-methyltransferase or a C-methyltransferase, c) inactivation of the MO, d) substitution of the MO with a halogenase, a sulfatase, a glycosylase, a racemase, an O-methyltransferase or a C-methyltransferase, e) inactivation of the AMT, f) substitution of the AMT with a halogenase, a sulfatase, a glycosylase, a racemase, an O-methyltransferase or a C-methyltransferase, g) inactivation of the PKS module, h) substitution of the entire PKS module with an alternative PKS module and/or substitution of one or more of the domains (KS, AT, ACP) therein, i) inactivation of the A* domain, j) substitution of the A* domain with alternative A domains, k) inactivation of one or more of the NRPS modules and l) substitution of one or more of the NRPS modules with alternative NRPS modules and/or substitution of one or more of the domains (C, A, T) therein.

Halogenases, sulfatases, glycosylases, racemases, O-methyltransferases and C-methyltransferases are known from prokaryotes. These enzymes are encoded by genes of the secondary metabolism, in particular NRPS/PKS systems.

Alternative PKS systems, entire modules as well as single domains (KS, AT, ACP), are found among the bacteria in cyanobacteria as well as in Actinomycetes, Myxobacteria and Bacillus.

Alternative NRPS systems, entire modules as well as single domains (C, A, T), are found among the bacteria in cyanobacteria as well as in Actinomycetes, Myxobacteria and Bacillus.

In a preferred embodiment the above are from cyanobacteria.

It is important to note that said inactivation and/or substitution may be done in many ways, e.g. inactivation may imply deleting the complete activity or domain, or may imply inactivation by means of a single nucleotide exchange.

The methods are known to those skilled in the art and comprise basic molecular biological methods such as DNA isolation, restriction digestion, ligation, transformation, amplification etc.

In a preferred embodiment said alternative modules or domains which are used for substitution of the original module or domain may additionally comprise one or more SP nucleic acids according to the invention located adjacently upstream of the module or domain used for substitution and one or more SP nucleic acids according to the invention located adjacently downstream of the module or domain used for substitution. Thus, in this embodiment of the invention a construct is made comprising the domain which is to be entered into the original nucleic acid according to the invention, further comprising one or more SPs located adjacently in an upstream and downstream manner. This construct is then ligated into the original microginin synthetase encoding nucleic acid. The resultant construct is then brought into a host by means of transformation, either a) for integration into the host chromosome or b) on a self-replicating vector.

The polypeptides, i.e. proteins, can be any of those described above but with not more than 10 (e.g., not more than: 10, nine, eight, seven, six, five, four, three, two, or one) conservative substitutions. Conservative substitutions are known in the art and typically include substitution of, e.g., one polar amino acid with another polar amino acid and one acidic amino acid with another acidic amino acid. Accordingly, conservative substitutions preferably include substitutions within the following groups of amino acids: glycine, alanine, valine, proline, isoleucine, and leucine (non-polar, aliphatic side chain); aspartic acid and glutamic acid (negatively charged side chain); asparagine, glutamine, methionine, cysteine, serine and threonine (polar uncharged side chain); lysine, histidine and arginine (positively charged side chain); and phenylalanine, tryptophan and tyrosine (aromatic side chain). It is well known in the art how to determine the effect of a given substitution, e.g. on pK₁ etc. All that is required of a polypeptide having one or more conservative substitutions is that it has at least 50% (e.g., at least: 55%; 60%; 65%; 70%; 75%; 80%; 85%; 90%; 95%; 98%; 99%; 99.5%; or 100% or more) of the ability of the unaltered protein according to the invention.
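The amino acid groups listed above can be used to classify the differences between an unaltered protein and a variant of equal length. The following Python sketch is illustrative only: the names `CONSERVATIVE_GROUPS`, `classify_substitutions` and `within_substitution_limit` are hypothetical, one-letter amino acid codes are assumed, and the sketch does not assess the functional criterion (at least 50% of the ability of the unaltered protein), which has to be determined experimentally.

```python
# Groups of amino acids within which a substitution counts as conservative,
# written in one-letter code and following the groups listed in the text.
CONSERVATIVE_GROUPS = [
    "GAVPIL",  # non-polar, aliphatic side chain
    "DE",      # negatively charged side chain
    "NQMCST",  # polar uncharged side chain
    "KHR",     # positively charged side chain
    "FWY",     # aromatic side chain
]
GROUP_OF = {aa: idx for idx, group in enumerate(CONSERVATIVE_GROUPS) for aa in group}


def classify_substitutions(reference, variant):
    """Count (conservative, non_conservative) substitutions between two
    sequences of equal length, compared position by position."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    conservative = 0
    non_conservative = 0
    for ref_aa, var_aa in zip(reference, variant):
        if ref_aa == var_aa:
            continue
        ref_group = GROUP_OF.get(ref_aa)
        if ref_group is not None and ref_group == GROUP_OF.get(var_aa):
            conservative += 1
        else:
            non_conservative += 1
    return conservative, non_conservative


def within_substitution_limit(reference, variant, max_conservative=10):
    """True if the variant differs from the reference only by conservative
    substitutions and by no more than max_conservative of them."""
    conservative, non_conservative = classify_substitutions(reference, variant)
    return non_conservative == 0 and conservative <= max_conservative


if __name__ == "__main__":
    # I->V and D->E are both substitutions within one of the groups above.
    print(classify_substitutions("IDPPLTPLDKG", "VDPPLTPLEKG"))
```

Insertions and deletions are outside the scope of this position-by-position comparison.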

In preferred embodiments the polynucleotides, i.e. nucleic acids of the present invention, also comprise nucleic acid molecules which are at least 85%, preferably 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to those claimed herein.

The determination of percent identity between two sequences is accomplished using the mathematical algorithm of Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90: 5873-5877. Such an algorithm is incorporated into the BLASTN and BLASTP programs of Altschul et al. (1990) J. Mol. Biol. 215: 403-410. BLAST nucleotide searches are performed with the BLASTN program, score=100, wordlength=12, to obtain nucleotide sequences homologous to the nucleic acids according to the invention. BLAST protein searches are performed with the BLASTP program, score=50, wordlength=3, to obtain amino acid sequences homologous to the polypeptides according to the invention. To obtain gapped alignments for comparative purposes, Gapped BLAST is utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs are used.
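Once two sequences have been aligned, the percent identity and its comparison with a threshold such as 85% reduce to simple arithmetic. The following Python sketch is illustrative only and is not the BLAST algorithm: it assumes that the two sequences have already been aligned to equal length (with "-" marking gaps), it counts identity over the full alignment length, and the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same
    residue. Gap characters ("-") never count as a match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold=85.0):
    """True if the aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Two 10-nucleotide strings differing in the last position.
    print(percent_identity("ATTGATCCCC", "ATTGATCCCA"))
```

Different tools normalise identity differently (for example over the aligned region only), so values from this sketch are not necessarily comparable with values reported by BLAST.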

FIGURES

FIG. 1 depicts the structure of microginin.

FIG. 2 depicts the microginin synthetase gene cluster and the biosynthetic pathway of microginin.

EXAMPLES Example 1 Method for Detecting Gene Clusters According to the Invention

Strains carrying a gene cluster encoding a microginin synthetase complex can be distinguished from strains not carrying such a gene cluster by performing a PCR reaction using RedTaq ReadyMix PCR Reaction Mix with MgCl₂ (Sigma) and the primer pairs and corresponding annealing temperatures as described in Claims 11-12. In particular the PCR conditions are as follows: an initial denaturation for 1 minute at 95° C., followed by 30 cycles of denaturation at 95° C. for 30 seconds, annealing at said annealing temperatures for 30 seconds and extension at 72° C. for 1 kb of product size.

Example 2 Method for Optimised Cultivation of Microginin Producing Microcystis spp.

Strains.
Media: Bg 11 (for cultivation of cyanobacteria)
Aeration: air containing 0.3-3.0% carbon dioxide
Light intensity: 40-100 μE/m²*s (diameter of illuminated culture vessels of photobioreactor d=4-12 cm)
Light quality: additional red light illumination with 25 μE/m²*s for 24-48 hours before harvesting
Cell density at harvest: OD_(750nm) 1-2

Tables

TABLE 1 SEQ IDMTINYGDLQEPFNKFSTLVELLRYRASSQPERLAYIFLRDGEIEEARLTYGELDQKARAI NO. 1 A*AAYLQSLEAEGERGLLLYPPGLDFISAFFGCLYAGVVAIPAYPPRRNQNLLRLQAIIADSQARFTFTNAALFPSLKNQWAKDPELGAMEWIVTDEIDHHLREDWLEPTLEKNSLAFLQYTSGSTGTPKGVMVSHHNLLINSADLDRGWGHDQDSVMVTWLPTFHDMGLIYGVIQPLYKGFLCYMMSPASFMERPLRWLQALSDKKATHSAAPNFAYDLCVRKIPPEKRATLDLSHWCMALNGAEPVRAEVLKKFAEAFQVSGFKATALCPGYGLAEATLKVTAVSYDSPPYFYPVQANALEKNKIVGATETDTNVQTLVGCGWTTIDTQIVIVNPETLKPCSPEIVGEIWVSGSTIAQGYWGKPQETQETFQAYLADTGAGPFLRTGDLGFIKDGELFITGRLKEIILIRGRNNYPQDIELTVQNSHPALRPSCGAAFTVENKGEEKLVVVQEVERTWLRKVDIDEVKRAIRKAVVQEYDLQVYAIALIRTGSLPKTSSGKIQRRSCRAKFLEGSLEILG SEQ IDMSTEIPNDKKQPTLTKIQNWLVAYMTEMMEVDEDEIDLSVPFDEYGLDSSMAVALIADLE NO. 2DWLRRDLHRTLIYDYPTLEKLAKQVSEP ACP SEQ IDMEPIAIIGLACRFPGADNPEAFWQLMRNGVDAIADIPPERWDIERFYDPTPATAKKMYSR NO. 3QGGFLKNVDQFDPQFFRISPLEATYLDPQQRLLLEVTWEALENAAIVPETLAGSQSGVFI KSGISDVDYHRLAYQSPTNLTAYVGTGNSTSIAANRLSYLFDLRGPSLAVDTACSSSLVAVHLACQSLQSQESNLCLVGGVNLILSPETTVVFSQARMIAPDSRCKTFDARADGYVRSEGCGVVVLKRLRDAIQDGDRILAVIEGSAVNQDGLSNGLTAPNGPAQQAVIRQALANAQVKPAQISYVEAHGTGTELGDPIEVKSLKAVLGEKRSLDQTCWLGSVKTNIGHLEAAAGMAGLIKVVLCLQHQEIPPNLHFQTLNPYISLADTAFAIPTQAQPWRTKPPKSGENGVERRLAGLSSF GFGGTNSHVILSEQ ID VFLFAGQGSQYVGMGRQLYETQPIFRQTLDRCAEILRPHLDQPLLEILYPADPEAETASF NO.4 AT YLEQTAYTQPTLFAFEYALAQLWRSWGIEPAAVIGHSVGEYVAATVAGALSLEEGLTLIAKRAKLMQSLPKNGTMIAVFAAEERVKAVIEPYRTDVAIAAVNGPENFVISGKAPIIAEIIIHLTAAGIEVRPLKVSHAFHSHLLEPILDSLEQEAAAISYQPLQIPLVANLTGEVLPEGATIEARYWRNHARNPVQFYGSIQTLIEQKFSLFLEVSPKPTLSRLGQQCCPERSTTWLFSLAPPQEEEQSLLNSLAILYDSQGAE SEQ IDITLQTLVGNLLQLSPADVNVHTPFLEMGADSIVMVEAVRRIENTYNVKIAMRQLFEELST NO. 5LDALATYL ACP 2 SEQ IDKEMLYPIVAQRSQGSRIWDVDGNEYIDMTMGQGVTLFGHQPDFIMSALQSQLTEGIHLNP NO. 
6RSPIVGEVAALICELTGAERACFCNSGTEAVMAAIRIARATTGRSKIALFEGSYHGHADG AMTTLFRNQIIDNQLHSFPLALGVPPSLSSDVVVLDYGSAEALNYLQTQGQDLAAVLVEPIQSGNPLLQPQQFLQSLRQITSQMGIALIFDEMITGFRSHPGGAQALFGVQADIATYGKVVAGGMPIGVIAGKAHYLDSIDGGMWRYGDKSYPGVDRTFFGGTFNQHPLAMVAARAVLTHLKEQGPGLQQQLTERTAALADTLNHYFQAEEVPIKIEQFSSFFRFALSGNLDLLFYHMVEKGIYVWEWRKHFLSTAHTEADLAQFVQAVKDSITELR SEQ IDGGDQVPLTEAQRQLWILAQLGDNGSVAYNQSVTLQLSGPLNPVAMNQAIQQISDRHEALR NO. 7 CTKINAQGDSQEILPQVEINCPILDFSLDQASAQQQAEQWLKEESEKPFDLSQGSLVRWHLLKLEPELHLLVLTAHHIISDGWSMGVILRELGELYSAKCQGVTANLKTPKQFRELIEWQSQPSQGEELKKQQAYWLATLADPPVLNLPTDKPRPALPSYQANRRSLTLDSQFTEKLKQFSRKQGCTLLMTLLSVYNILVHRLTGQDDILVGLPASGRGLLDSEGMVGYCTHFLPIRSQLA SEQ IDTYSELNCRANQLAHYLQKLGVGPEVLVGILVERSLEMIVGLLGILKAGGAYVPLDPDYPP NO. 8 AERLQFMLEDSQFFLLLTQQHLLESFAQSSETATPKIICLDSDYQIISQAKNINPENSVTTSNLAYVIYTSGSTGKPKGVMNNHVAISNKLLWVQDTYPLTTEDCILQKTPFSFDVSVWELFWPLLNGARLVFAKPNGHKDASYLVNLIQEQQVTTLHFVSSMLQLFLTEKDVEKCNSLKRVICSGEALSLELQERFFARLVCELHNLYGPTEAAIHVTFWQCQSDSNLKTVPIGRPIANIQIYILDSHLQPVPIGVIGELHIGGVGLARGYLNRPELTAEKFIANPFASLDPPLTPLDKGGDESYKTFKKGGEQPSRLYKTGDLARYLPDGKIEYLGRIDNQVKIRGFRIELGEIEAVLL SHPQVREAVVSEQ ID EAIAAIFGQVLKLEKVGIYDNFFEIGGNSLQATQVISRLRESFALELPLRRLFEQPTVAD NO.9 T LALAV SEQ IDPRDGQLPLSFAQSRLWFLYQLEGATGTYNMTGALSLSGPLQVEALKQALRTIIQRHEPLR NO. 10 C 2TSFQSVDGVPVQVINPYPVWELAMVDLTGKETEAEKLAYQESQTPFDLTNSPLLRVTLLKLQPEKHILLINMHHIISDGWSIGVFVRELSHLYRAFVAGKEPTLPILPIQYADFAVWQREWLQGKVLAAQLEYWKRQLADAPPLLELPTDRPRPAIQTFQGKTERFELDRKLTQELKALSQQSGCTLFMTLLAAFGVVLSRYSGQTDIVIGSAIANRNRQDIEGLIGFFVNTLALRLDLS SEQ IDTYGELNHRANQLAHYLQSLGVTKEQIVGVYLERSLEMAIGFLGILKAGAAYLPIDPEYPS NO. 11 A 2VRTQFILEDTQLSLLLTQAELAEKLPQTQNKIICLDRDWPEITSQPQTNLDLKIEPNNLAYCIYTSGSTGQPKGVLISHQALLNLIFWHQQAFEIGPLHKATQVAGIAFDATVWELWPYLTTGACINLVPQNILLSPTDLRDWLLNREITMSFVPTPLAEKLLSLDWPNHSCLKTLLLGGDKLHFYPAASLPFQVINNYGPTENTVVATSGLVKSSSSHHFGTPTIGRPIANVQIYLLDQNLQPVPIGVPGELHLGGAGLAQGYLNRPELTAEKFIANPFDPPLTPLDKGGEEPSKLYKTGDLARYLPDGNVEFLGRIDNQVKIRGFRIETGEIEAVLSQYFLLAESVV SEQ IDAQLTQIWSEVLGLERIGVKDNFFELGGHSLLATQVLSRINSAFGLDLSVQIMFESPTIAG NO. 
12 T 2IAGYI SEQ IDARDGHLPLSFAQQRLWFLHYLSPDSRSYNTLEILQIDGNLNLTVLEQSLGELINRHEIFR NO. 13 C 3TTFPTVSGEPIQKIALPSRFQLKVDNYQDLDENEQSAKIQQVAELEAGQAFDLTVGPLIQFKLLQLSPQKSVLLLKMHHIIYDGWSFGILIRELSALYEAFLKNLANPLPALSIQYADFAVWQRQYLSGEVLDKQLNYWQEQLATVSPVLTLPTDRPRPAIQTFQGGVERFQLDQNVTQGLKKLGQDQVATLFMTLLAGFGVLLSRYSGQSDLMVGSPIANRNQAAIEPLIGFFANTLAL RINLS SEQID TYTELNHRANQLAHYLQTLGVGAEVLVGISLERSLEMIIGLLGILKVGGAYLPLDPDYPT NO. 14 A3 ERLQLMLEDSQVPFLITHSSLLAKLPPSQATLICLDHIQEQISQYSPDNLQCQLTPANLANVIYTSGSTGKPKGVMVEHKGLVNLALAQIQSFAVNHNSRVLQFASFSFDACISEILMTFGSGATLYLAQKDALLPGQPLIERLVKNGITHVTLPPSALVVLPQEPLRNLETLIVAGEACSLDLVKQWSIDRNFFNAYGPTEASVCATIGQCYQDDLKVTIGKAIANVQIYILDAFLQPVPVGVSGELYIGGVGVARGYLNRPELTQEKFIANPFSNDPDSRLYKTGDLARYLPDGNIEYLGRIDNQVKIRGFRIELGEIEAVLSQCPDVQNTAV SEQ IDEILAQIWGQVLKIERVSREDNFFELGGHSLLATQVMSRLRETFQVELPLRSLFTAPTIAE NO. 15 T 3LALTI SEQ IDNDSANLPLSFAQQRLWFLDQLEPNSAFYHVGGAVRLEGTLNITALEQSLKEIINRHEALR NO. 16 C 4TNFITIDGQATQIIHPTINWRLSVVDCQNLTDTQSLEIAEAEKPFNLAQDCLFRATLFVRSPLEYHLLVTMHHIVSDGWSIGVFFQELTHLYAVYNQGLPSSLTPIKIQYADFAVWQRNWLQGEILSNQLNYWREQLANAPAFLPLPTDRPRPAIQTFIGSHQEFKLSQPLSQKLNQLSQKHGVTLFMTLLAAFATLLYRYTGQADILVGSPIANRNRKEIEGLIGFFVNTLVLRLSLD SEQ IDTYAELNHQANQLVHYLQTLGIGPEVLVAISVERSLEMIIGLLAILKACGAYLPLAPDYPT NO. 17 A 4ERLQFMLEDSQASFLITHSSLLEKLPSSQATLICLDHIQEQISQYSPDNLQSELTPSNLANVIYTSGSTGKPKGVMVEHRGLVNLASSQIQSFAVKNNSRVLQFASFSFDACISEILMTFGSGATLYLAQKNDLLPGQPLMERLEKNKITHVTLPPSALAVLPKKPLPNLQTLIVAGEACPLDLVKQWSVGRNFFNAYGPTETSVCATIGQCYQDDLKVTIGKAIANVQIYILDAFLQPVPIGVPGELYIGGVGVARGYLNRPELTAERFIPNPFDPPLTPLKKGGDKSYETFKKGEEQPSKLYKTGDLARYLPDGNIEYLGRIDNQVKIRGFRIELGEIEAVLSQCPDVQNTAV SEQ IDLQLAQIWSEILGINNIGIQENFFELGGHSLLAVSLINRIEQKLDKRLPLTSLFQNGTIAS NO. 18 T 4LAQLL SEQ IDTPFFAVHPIGGNVLCYADLARNLGTKQPFYGLQSLGLSELEKTVASIEEMAMIYIEAIQT NO. 19VQASGPYYLGGWSMGGVIAFEIAQQLLTQGQEVALLALIDSYSPSLLNSVNREKNSANSL TETEEFNEDINIAYSFIRDLASIFNQEISFSGSELAHFTSDELLDKFITWSQETNLLPSDFGKQQVKTWFKVFQINHQALSSYSPKTYLGRSVFLGAEDSSIKNPGWHQ SEQ IDFSLYYFGSYEAEFNPNKYNLLFEGAKFGDRAGFTALWIPERHFHAFGGFSPNPSVLAAAL NO. 
20ARETKQIQLRSGSVVLPLHNSIRVAEEWAVVDNLSQGRVGIAFASGWHPQDFVLAPQSFG MOQHRELMFQEIETVQKLWRGEAITVPDGKGQRVEVKTYPQPMQSQLPSWITIVNNPDTYIRAGAIGANILTNLMGQSVEDLARNIALYRQSLAEHGYDPASGTVTVLLHTFVGKDLEQVREQARQPFGQYLTSSVGLLQNMVKSQGMKVDFEQLRDEDRDFLLASAYKRYTETSALIGTPESCRQIIDHLQSIGVDEVACFIDFGVDEQTVLANLPYLQSLKDLYQ SEQ IDIDPPLTPLDKGIDPPLTPLDKGIDPPLTPLDKG NO. 21 SP 1 SEQ IDPYQGGLGGDQSPYQGGLGGDQSPYQGGLGGDQSPYQGGLGGDQSPYQGGLGGDQSPYQGE NO. 22LGGDQSPYQGGLGGDQV SP 2 SEQ IDPASEMREWVENTVSRILAFQPERGLEIGCGTGLLLSRVAKHCLEYWATDYSQGAIQYVER NO. 23VCNAVEGLEQVKLRCQMADNFEGIALHQFDTVVLNSIIQYFPSVDYLLQVLEGAINVIGE MTRGQIFVGDVRSLPLLEPYHAAVQLAQASDSKTVEQWQQQVRQSVAGEEELVIDPTLFLALKQHFPQISWVEIQPKRGVAHNELTQFRYDVTLHLETINNQALLSGNPTVITWLNWQLDQLSLTQIKDKLLTDKPELWGIRGIPNQRVEEALKIWEWVENAPDVETVEQLKKLLKQQVDTGINPEQVWQLAESLGYTAHLSWWESSQDGSFDVIFQRNSEAEDSKKLTLSKLAFWDEKPFKIKPWSDYTNNPLRGKLVQKLIP SEQ IDMTNYGKSMSHYYDLVVGHKGYNKDYATEVEFIHNLVETYTTEAKSILYLGCGTGYHAALL NO. 24AQKGYSVHGVDLSAEMLEQAKTRIEDETIASNLSFSQGNICEIRLNRQFNVVLALFHVVN MT 2YQTTNQNLLATFATVKNHLKAGGIFICDVSYGSYVLGEFKSRPTASILRLEDNSNGNEVTYISELNFLTHENIVEVTHNLWVTNQENQLLENSRETHLQRYLFKPEVELLADACELTVLDAMPWLEQRPLTNIPCPSVCFVIGHKTTHSA SEQ IDATGACTATTAACTATGGTGATCTGCAAGAACCCTTTAATAAATTCTCAACCCTAGTTGAA NO. 
25TTACTCCGTTATCGGGCAAGCAGTCAACCGGAACGCCTCGCCTATATTTTTCTGCGAGAC A* nuclGGAGAAATCGAAGAAGCTCGTTTAACCTATGGGGAACTGGATCAAAAGGCTAGGGCGATC acidGCCGCTTATCTACAATCCTTAGAAGCCGAGGGCGAAAGGGGTTTACTGCTCTATCCCCCAGGACTAGATTTTATTTCAGCTTTTTTTGGTTGTTTATATGCGGGAGTCGTTGCCATTCCCGCCTATCCACCCCGACGGAATCAAAACCTTTTGCGTTTACAGGCGATTATTGCCGATTCTCAAGCCCGATTTACCTTCACCAATGCCGCTCTATTTCCCAGTTTAAAAAACCAATGGGCTAAAGACCCTGAATTAGGAGCAATGGAATGGATTGTTACCGATGAAATTGACCATCACCTCAGGGAGGATTGGCTAGAACCAACCCTCGAAAAAAACAGTCTCGCTTTTCTACAATACACCTCTGGTTCAACGGGAACTCCAAAGGGAGTAATGGTCAGTCACCATAATTTGTTGATTAATTCAGCCGATTTAGATCGTGGTTGGGGCCATGATCAAGATAGCGTAATGGTCACTTGGCTACCGACCTTCCATGATATGGGTCTGATTTATGGGGTTATTCAGCCTTTGTACAAAGGATTTCTTTGTTACATGATGTCCCCTGCCAGCTTTATGGAACGACCGTTACGTTGGTTACAGGCCCTTTCTGATAAAAAAGCAACCCATAGTGCGGCCCCCAACTTTGCCTACGATCTTTGTGTGCGGAAAATTCCCCCTGAAAAACGGGCTACGTTAGACTTAAGCCATTGGTGCATGGCCTTAAATGGGGCCGAACCCGTCAGAGCGGAGGTACTTAAAAAGTTTGCGGAGGCCTTTTCAAGTTTCTGGTTTCAAAGCCACAGCCCTTTGTCCTGGCTACGGTTTAGCAGAAGCCACCCTGAAAGTTACGGCGGTTAGTTATGACAGTCCCCCTTACTTTTATCCCGTTCAGGCTAATGCTTTAGAAAAAAATAAGATTGTGGGAGCCACTGAAACCGATACCAATGTGCAGACCCTCGTGGGCTGCGGCTGGACAACGATTGATACTCAAATCGTCATTGTCAATCCTGAAACCCTGAAACCTTGCTCCCCTGAAATTGTCGGCGAAATTTGGGTATCAGGTTCAACAATCGCCCAAGGCTATTGGGGAAAACCTCAAGAGACTCAGGAAACCTTTCAAGCTTATTTGGCAGATACAGGAGCCGGGCCTTTTCTGCGAACAGGAGACTTGGGCTTCATTAAAGATGGTGAATTGTTTATCACAGGTCGGCTCAAGGAAATTATTCTGATTCGAGGACGCAATAATTATCCCCAGGATATTGAATTAACCGTCCAAAATAGTCATCCCGCTCTGCGTCCCAGTTGTGGGGCTGCTTTTACCGTTGAAAATAAGGGCGAAGAAAAGCTCGTGGTCGTTCAGGAAGTGGAGCGCACCTGGCTCCGTAAGGTAGATATAGATGAGGTAAAAAGAGCCATTCGTAAAGCTGTTGTCCAGGAATATGATTTACAGGTTTATGCGATCGCGCTGATCAGGACTGGCAGTTTACCAAAAACCTCTAGCGGTAAAATTCAGCGTCGTAGCTGTCGGGCCAAATTTTTAGAGGGAAGCCTGGAAATTTTGGGC TAA SEQ IDATGTCCACAGAAATCCCAAACGACAAAAAACAACCGACCCTAACGAAAATTCAAAACTGG NO. 
26TTAGTGGCTTACATGACAGAGATGATGGAAGTGGACGAAGATGAGATTGATCTGAGCGTT ACP nuclCCCTTTGATGAATATGGTCTCGATTCTTCTATGGCAGTTGCTTTGATCGCTGATCTAGAG acidGATTGGTTACGACGAGATTTACATCGCACCCTGATCTACGATTATCCAACTCTAGAAAAGTTGGCTAAACAGGTTAGTGAACCCTGA SEQ IDATGGAACCCATCGCAATTATTGGTCTTGCTTGCCGCTTTCCAGGGGCTGACAATCCAGAA NO. 27GCTTTCTGGCAACTCATGCGAAATGGGGTGGATGCGATCGCCGATATTCCTCCTGAACGT KS nuclTGGGATATTGAGCGTTTCTACGATCCCACACCTGCCACTGCCAAGAAGATGTATAGTCGC acidCAGGGCGGTTTTCTAAAAAATGTCGATCAATTTGACCCTCAATTTTTCCGAATTTCTCCCCTAGAAGCCACCTATCTAGATCCTCAACAAAGACTGCTACTGGAAGTCACCTGGGAAGCCTTAGAAAATGCTGCCATTGTGCCTGAAACCTTAGCTGGTAGCCAATCAGGGGTTTTTATTGGTATCAGTGATGTGGATTATCATCGTTTGGCTTATCAAAGTCCTACTAACTTGACCGCCTATGTGGGTACAGGCAACAGCACCAGTATTGCGGCTAACCGTTTATCATATCTGTTTGATTTGCGTGGCCCCAGTTTGGCCGTAGATACCGCTTGCTCTTCTTCCCTCGTCGCCGTTCACTTGGCCTGTCAGAGTTTGCAAAGTCAAGAATCGAACCTCTGCTTAGTGGGGGGAGTTAATCTCATTTTGTCGCCAGAGACAACCGTTGTTTTTTCCCAAGCGAGAATGATCGCCCCCGACAGTCGTTGTAAAACCTTTGACGCGAGGGCCGATGGTTATGTGCGCTCGGAAGGCTGTGGAGTAGTCGTACTTAAACGTCTTAGGGATGCCATTCAGGACGGCGATCGCATTTTAGCAGTGATTGAAGGTTCCGCGGTGAATCAGGATGGTTTAAGTAATGGACTCACGGCCCCTAATGGCCCTGCTCAACAGGCGGTGATTCGTCAGGCCCTGGCAAATGCCCAGGTAAAACCGGCCCAGATTAGCTATGTCGAAGCCCATGGCACGGGGACAGAATTGGGGGATCCGATCGAAGTTAAATCTCTGAAAGCGGTTTTGGGTGAAAAGCGATCGCTCGATCAAACCTGTTGGCTCGGTTCTGTGAAAACCAACATTGGTCATTTAGAAGCGGCGGCGGGAATGGCGGGTCTGATTAAAGTCGTTCTCTGCCTACAACACCAAGAAATTCCCCCTAATCTCCACTTTCAAACCCTTAATCCCTATATTTCCCTAGCTGACACAGCTTTTGCGATTCCCACTCAGGCTCAACCCTGGCGGACCAAACCCCCTAAGTCTGGTGAAAACGGTGTCGAACGACGTTTAGCAGGACTCAGTTCCTTTGGGTTTGGGGGGACAAATTCCCATGTGATTCTC SEQ IDGTTTTTCTATTTGCCGGTCAAGGTTCTCAATATGTAGGTATGGGTCGTCAACTGTACGAA NO. 
28ACCCAACCCATCTTTCGCCAAACCTTGGATCGCTGTGCTGAAATCCTGCGACCCCATTTA AT nuclGATCAACCCCTCTTAGAAATTCTTTATCCTGCTGACCCAGAAGCCGAAACAGCGAGTTTT acidTACCTAGAGCAGACTGCCTATACCCAACCCACTTTATTCGCATTCGAGTATGCCCTAGCACAGTTATGGCGTTCCTGGGGAATAGAACCGGCGGCAGTAATTGGTCACAGTGTCGGTGAATATGTGGCGGCCACCGTTGCCGGAGCCTTAAGTCTAGAAGAAGGATTAACGCTAATTGCCAAACGGGCAAAACTGATGCAGTCTCTCCCCAAGAATGGGACAATGATCGCCGTTTTTGCCGCAGAAGAGCGGGTTAAAGCTGTTATTGAGCCTTATAGGACTGATGTAGCGATCGCTGCTGTTAATGGACCAGAAAATTTTGTTATTTCAGGAAAAGCGCCGATTATTGCTGAGATTATCATTCATTTAACGGCAGCAGGAATAGAAGTTCGTCCTCTCAAAGTTTCCCATGCTTTTCACTCGCACCTGTTGGAGCCAATTTTAGATTCCTTAGAACAGGAAGCTGCTGCTATTTCCTACCAACCCCTGCAAATTCCCTTAGTTGCTAATTTAACGGGGGAAGTTCTACCAGAAGGAGCAACGATTGAGGCTCGTTACTGGCGAAATCATGCACGCAACCCTGTACAATTTTATGGGAGTATCCAAACGCTGATCGAGCAGAAATTCAGTCTTTTTTTAGAAGTTAGCCCTAAACCGACTTTATCTCGATTGGGTCAACAATGTTGTCCAGAAAGATCGACCACTTGGCTATTTTCCCTCGCCCCTCCTCAAGAAGAAGAACAAAGCCTACTAAATAGTTTGGCGATTCTCTATGATTCCCAAGGAGCCGAA SEQ IDATCACATTGCAAACCCTAGTGGGAAATTTACTGCAATTGTCCCCTGCTGATGTCAATGTT NO. 29CATACACCTTTCCTGGAGATGGGGGCAGATTCCATTGTCATGGTTGAGGCGGTCAGACGG ACP 2ATTGAGAATACCTATAACGTTAAAATTGCTATGCGTCAGTTATTTGAGGAGTTATCTACT nucl acidTTAGATGCTTTAGCTACTTATTTA SEQ IDAAAGAGATGCTTTATCCCATTGTGGCCCAACGTTCTCAAGGATCAAGAATTTGGGATGTG NO. 
30GACGGTAATGAATATATTGATATGACGATGGGGCAAGGGGTAACGCTGTTTGGGCATCAA AMTCCAGACTTCATTATGTCGGCCCTACAAAGCCAACTCACTGAAGGCATTCATCTCAATCCG nucl acidCGATCGCCAATTGTGGGAGAAGTGGCCGCCTTAATTTGTGAACTAACAGGAGCCGAACGAGCTTGTTTTTGCAACTCTGGAACCGAAGCCGTAATGGCCGCTATTCGTATCGCCAGGGCAACAACAGGTCGGAGTAAAATTGCCCTCTTTGAAGGCTCCTATCATGGACATGCGGACGGAACCCTTTTTAGGAACCAAATTATTGATAACCAACTCCACTCTTTTCCCCTAGCTCTAGGCGTTCCCCCCAGCCTTAGTTCCGATGTGGTGGTATTGGACTATGGCAGTGCGGAAGCTCTGAACTATTTACAAACCCAGGGGCAGGATTTAGCGGCGGTCTTAGTAGAACCAATTCAAAGTGGCAATCCTCTACTCCAACCCCAACAATTTCTCCAAAGTCTGCGACAAATTACCAGTCAAATGGGCATTGCCCTGATTTTTGATGAAATGATTACGGGTTTTCGATCGCACCCAGGGGGAGCGCAAGCTTTATTTGGAGTACAGGCGGATATTGCCACCTATGGCAAAGTAGTTGCGGGAGGAATGCCCATTGGAGTTATTGCAGGTAAGGCCCATTATCTGGACAGCATTGACGGGGGAATGTGGCGTTATGGCGATAAATCCTATCCTGGGGTGGACAGAACCTTTTTTGGGGGAACCTTTAATCAGCATCCGTTAGCAATGGTAGCGGCTAGGGCTGTCCTGACCCATTTAAAGGAGCAGGGGCCAGGTCTGCAACAACAATTAACTGAACGCACTGCGGCCTTAGCCGATACACTGAATCATTATTTTCAAGCCGAAGAAGTTCCTATTAAAATCGAACAGTTTAGTTCTTTCTTCCGGTTTGCCCTCTCTGGCAATTTGGATTTACTTTTCTATCACATGGTAGAAAAAGGTATTTATGTCTGGGAATGGCGTAAACATTTTCTTTCAACCGCCCATACGGAAGCCGATCTTGCCCAATTTGTCCAAGCGGTTAAGGATAGCATCACAGAATTGCGT SEQ IDGGGGGGGATCAAGTCCCTCTCACCGAAGCCCAACGACAACTGTGGATTTTGGCTCAATTA NO. 
31 CGGAGACAACGGCTCTGTGGCCTATAACCAATCAGTGACATTGCAATTAAGTGGCCCATTA nucl acidAATCCCGTCGCAATGAATCAAGCTATTCAACAAATCAGCGATCGCCATGAAGCGTTACGAACCAAAATTAATGCCCAGGGAGATAGTCAAGAAATCCTGCCCCAGGTCGAAATTAACTGCCCTATCTTAGACTTCAGTCTTGACCAAGCTTCGGCCCAACAGCAAGCAGAACAATGGTTAAAGGAAGAAAGTGAAAAACCCTTTGATTTGAGCCAGGGTTCTCTCGTGCGTTGGCATCTACTCAAATTAGAACCAGAATTACATTTGTTAGTATTAACGGCCCATCACATTATCAGTGACGGTTGGTCAATGGGGGTAATCCTTCGGGAATTAGGAGAGTTATATTCAGCCAAATGTCAGGGTGTTACGGCTAATCTTAAAACCCCAAAACAGTTTCGAGAATTGATTGAATGGCAAGCCAGCCAAGCCAAGGGGAAGAACTGAAAAAACAGCAAGCCTATTGGTTAGCAACCCTTGCCGATCCCCCTGTTTTGAATTTACCCACTGACAAACCTCGTCCAGCTTTACCCAGTTACCAAGCTAATCGTCGAAGTCTAACTTTAGATAGCCAATTTACAGAAAAACTAAAGCAATTTAGTCGTAAACAGGGCTGTACCTTGCTGATGACCCTGTTATCGGTTTATAACATTCTCGTTCATCGTTTGACGGGACAGGATGATATTCTGGTGGGTCTGCCAGCCTCTGGACGGGGGCTTTTAGATAGTGAAGGTATGGTGGGTTATTGCACCCATTTTTTACCAATTCGCAGTCAATTAGCA SEQ IDACTTACAGTGAATTAAATTGTCGAGCCAATCAGTTAGCACATTATTTACAAAAATTAGGA NO. 32 AGTTGGGCCAGAGGTCTTAGTCGGTATTTTGGTCGAACGTTCTTTAGAAATGATTGTCGGA nucl acidTTGTTAGGGATTCTCAAGGCTGGGGGAGCCTATGTACCTCTTGATCCTGACTATCCCCCTGAACGTCTTCAATTTATGTTAGAAGATAGTCAATTTTTTCTCCTCTTAACCCAACAGCATTTACTGGAATCTTTTGCTCAGTCTTCAGAAACGGCTACTCCCAAGATTATTTGTTTGGATAGCGACTACCAAATTATTTCCCAGGCAAAGAATATTAATCCCGAAAATTCAGTCACAACGAGTAATCTTGCCTATGTAATTTATACCTCTGGTTCGACAGGTAAACCGAAGGGCGTGATGAATAATCATGTTGCTATTAGTAATAAATTGTTATGGGTACAAGACACTTATCCTCTAACCACAGAAGACTGTATTTTACAAAAAACTCCCTTTAGTTTTGATGTTTCAGTGTGGGAATTATTCTGGCCCCTACTAAACGGAGCGCGTTTGGTTTTTGCCAAGCCGAATGGCCATAAAGATGCCAGTTACTTAGTCAATCTGATTCAAGAGCAACAAGTAACAACGCTACATTTTGTGTCTTCTATGCTACAGCTTTTTCTGACAGAAAAAGACGTAGAAAAATGTAATAGTCTTAAACGAGTCATTTGTAGTGGTGAAGCCCTTTCTTTAGAGCTTCAAGAACGTTTTTTTGCTCGTTTAGTCTGTGAATTACACAATCTTTATGGACCGACAGAAGCCGCTATTCATGTCACATTTTGGCAATGTCAATCAGATAGCAATTTGAAAACAGTACCCATTGGTCGGCCGATCGCTAATATCCAAATTTACATTTTAGACTCTCATCTTCAGCCAGTACCTATTGGAGTAATCGGAGAATTGCACATTGGTGGGGTTGGTTTGGCGCGGGGTTATTTAAACAGGCCTGAGTTAACGGCGGAGAAATTTATTGCAAATCCGTTTGCTTCCCTTGATCCCCCCCTAACCCCCCTTGATAAGGGGGGAGATGAGAGCTATAAAACTTTTAAAAAGGGGGGAGAGCA
ACCATCAAGATTGTATAAAACGGGAGATTTAGCTCGTTATTTACCCGATGGCAAGATTGAGTATCTAGGGCGCATTGATAATCAGGTAAAAATTCGCGGTTTCCGGATTGAATTGGGGGAAATTGAAGCGGTTTTGCTATCCCATCCCCAGGTACGAGAAGCGGTCGTT SEQ IDGAGGCGATCGCCGCTATTTTTGGTCAAGTTTTAAAACTGGAAAAAGTGGGAATTTATGAT NO. 33 TAACTTTTTTGAGATCGGCGGTAATTCTTTGCAAGCCACTCAAGTTATTTCACGCTTACGA nucl acidGAAAGTTTTGCCCTAGAGTTGCCCTTGCGTCGCCTGTTTGAACAACCGACTGTGGCGGATTTGGCTTTAGCCGTA SEQ IDCCTCGTGATGGCCAATTACCCCTCTCCTTTGCCCAGTCGCGACTCTGGTTCTTGTATCAA NO. 34 CTTAGAAGGAGCCACGGGAACCTATAACATGACAGGGGCCTTGAGTTTAAGCGGGCCTCTT 2 nuclCAGGTCGAAGCCCTCAAACAAGCCCTAAGAACTATCATTCAACGCCATGAGCCATTGCGT acidACCAGTTTCCAATCGGTTGACGGGGTTCCAGTGCAGGTGATTAATCCCTATCCTGTTTGGGAATTAGCGATGGTTGATTTGACAGGAAAGGAGACAGAAGCAGAAAAATTGGCCTATCAGGAATCCCAAACCCCGTTTGATTTGACCAATAGTCCTTTGTTGAGGGTAACGCTCCTCAAATTACAGCCAGAAAAGCATATTTTATTAATTAATATGCACCATATTATTTCCGATGGCTGGTCAATCGGTGTTTTTGTTCGTGAATTGTCCCATCTCTATAGGGCTTTTGTGGCGGGTAAAGAACCAACTTTACCGATTTTACCAATTCAGTATGCGGATTTTGCCGTTTGGCAGCGAGAGTGGTTACAGGGTAAGGTTTTAGCGGCTCAATTGGAATATTGGAAGCGACAATTGGCAGATGCTCCTCCTCTGCTGGAACTGCCCACTGATCGCCCTCGTCCCGCAATCCAAACCTTTCAAGGCAAGACAGAAAGATTTGAGCTAGATAGGAAACTGACCCAAGAATTAAAGGCATTAAGTCAACAGTCGGGTTGTACTTTATTTATGACTTTGTTGGCCGCTTTTGGGGTGGTTTTATCCCGTTATAGTGGCCAGACTGATATCGTCATTGGTTCGGCGATCGCCAACCGTAATCGCCAAGACATTGAGGGGTTAATTGGCTTTTTTGTTAACACTTTGGCGTTGAGGTTAGATTTATCA SEQ IDACCTATGGAGAATTAAACCATCGCGCCAATCAATTAGCTCACTATCTTCAGTCGTTAGGA NO. 
35 AGTCACCAAAGAACAAATCGTCGGGGTTTATCTGGAACGTTCCCTTGAAATGGCGATCGGA 2 nuclTTTTTAGGTATTCTCAAAGCAGGAGCCGCCTATCTCCCCATTGATCCTGAATATCCCTCA acidGTACGCACCCAATTTATTCTCGAAGATACCCAACTTTCGCTTCTCTTAACTCAGGCAGAACTGGCAGAAAAACTGCCCCAGACTCAAAACAAAATTATCTGTCTAGATCGGGACTGGCCAGAAATTACCTCCCAACCCCAGACAAACCTAGACCTAAAGATAGAACCTAATAACCTAGCCTATTGCATCTATACTTCTGGTTCCACAGGACAACCCAAAGGAGTACTGATTTCCCATCAAGCCCTACTCAACTTAATTTTCTGGCATCAACAAGCGTTTGAGATTGGCCCCTTACATAAAGCGACCCAAGTGGCAGGCATTGCTTTCGATGCAACGGTTTGGGAATTGTGGCCCTATCTGACCACAGGAGCCTGTATTAATCTGGTTCCCCAAAATATTCTGCTCTCACCGACGGATTTACGGGATTGGTTGCTTAACCGAGAAATTACCATGAGTTTTGTGCCAACTCCTTTAGCTGAAAAATTATTATCCTTGGATTGGCCTAACCATTCTTGTCTAAAAACCCTGTTACTGGGAGGTGACAAACTTCATTTTTATCCTGCTGCGTCCCTTCCCTTTCAGGTCATTAACAACTATGGCCCAACGGAAAATACAGTGGTTGCGACCTCTGGACTGGTCAAATCATCTTCATCTCATCACTTTGGAACTCCGACTATTGGTCGTCCCATTGCCAACGTCCAAATCTATTTATTAGACCAAAACCTACAACCTGTCCCCATTGGTGTACCAGGAGAATTACATTTAGGTGGGGCGGGTTTAGCGCAGGGCTATCTCAATCGTCCTGAGTTAACGGCTGAAAAATTTATTGCCAATCCCTTTGATCCCCCCCTAACCCCCCTTGATAAGGGGGGAGAAGAACCCTCAAAACTCTATAAAACGGGAGACTTAGCCCGTTATTTACCCGATGGCAATGTAGAATTTTTGGGACGTATTGACAATCAGGTAAAAATTCGGGGTTTTCGCATCGAAACTGGGGAAATCGAAGCCGTTTTAAGTCAATATTTCCTATTAGCTGAAAGTGTAGTC SEQ IDGCTCAACTGACTCAAATTTGGAGTGAAGTTTTGGGACTGGAACGCATTGGCGTTAAGGAC NO. 36 TAACTTTTTTGAATTGGGAGGACATTCTCTTTTGGCTACCCAGGTTTTATCAAGAATTAAT 2 nuclTCAGCCTTTGGACTTGATCTTTCTGTGCAAATTATGTTTGAATCACCAACGATCGCGGGC acidATTGCGGGTTATATT SEQ IDGCTAGAGACGGTCATTTACCCCTGTCTTTTGCTCAACAACGTTTATGGTTTTTACATTAT NO. 
37 CCTTTCCCCTGATAGTCGTTCCTACAATACCCTGGAAATATTGCAAATTGATGGGAATCTC 3 nuclAATCTGACTGTGCTAGAGCAGAGTTTGGGGGAATTAATTAACCGCCATGAAATTTTTAGA acidACAACATTCCCCACTGTTTCAGGGGAACCGATTCAGAAAATTGCACTTCCTAGTCGTTTTCAGTTAAAAGTTGATAATTATCAAGATTTAGACGAAAATGAACAATCAGCTAAAATTCAACAAGTAGCAGAATTGGAAGCAGGACAAGCTTTTGATTTAACGGTGGGGCCACTGATTCAGTTTAAGCTATTGCAATTGAGTCCCCAGAAGTCGGTGCTGCTGTTGAAAATGCACCATATTATCTATGATGGCTGGTCTTTTGGGATTCTGATTCGGGAATTATCGGCTCTATACGAAGCATTTTTAAAGAACTTAGCCAATCCTCTCCCTGCGTTGTCTATTCAGTATGCAGATTTTGCGGTTTGGCAACGTCAATATCTCTCAGGTGAGGTCTTAGATAAACAACTCAATTATTGGCAAGAACAGTTAGCAACAGTCTCTCCTGTTCTTACTTTACCAACGGATAGACCCCGTCCGGCGATACAAACTTTTCAGGGAGGAGTTGAGCGTTTTCAACTGGATCAAAATGTCACTCAAGGTCTTAAAAAGTTAGGTCAAGATCAGGTTGCAACCCTGTTTATGACGTTGTTGGCCGGTTTCGGCGTTTTGCTATCTCGTTATAGTGGTCAATCTGATCTGATGGTGGGTTCTCCGATCGCTAATCGTAATCAAGCAGCGATCGAACCTTTAATTGGCTTTTTTGCTAACACTTTGGCTTTAAGAATTAATTTATCA SEQ IDACATACACTGAATTAAACCATCGCGCTAATCAGTTAGCCCATTATTTACAAACTTTAGGC NO. 38 AGTGGGAGCAGAAGTCTTAGTCGGTATTTCCCTAGAACGTTCTTTAGAGATGATTATCGGC 3 nuclTTATTAGGGATTCTCAAGGTAGGTGGTGCTTATCTTCCTCTTGATCCAGACTATCCCACT 
acidGAGCGTCTTCAGTTGATGTTAGAAGACAGTCAAGTTCCTTTTTTGATTACCCACAGTTCTTTATTAGCAAAATTGCCTCCCTCTCAAGCAACTCTGATTTGTTTAGATCATATCCAAGAGCAGATTTCTCAATATTCTCCAGATAATCTTCAATGTCAGTTAACTCCTGCCAATTTAGCTAACGTTATTTATACCTCTGGCTCTACGGGTAAGCCTAAAGGGGTGATGGTTGAACATAAAGGTTTAGTTAACTTAGCTCTTGCTCAAATTCAATCTTTTGCAGTCAACCATAACAGTCGTGTGCTGCAATTTGCTTCTTTTAGTTTTGATGCTTGTATTTCAGAAATTTTGATGACCTTTGGTTCTGGAGCGACGCTTTATCTTGCACAAAAAGATGCTTTATTGCCAGGTCAGCCATTAATTGAACGGTTAGTAAAGAATGGAATTACTCATGTGACTTTGCCGCCTTCAGCTTTAGTGGTTTTACCCCAGGAACCGTTACGCAACTTAGAAACCTTAATTGTGGCGGGTGAGGCTTGTTCTCTTGATTTAGTGAAACAATGGTCAATCGATAGAAACTTTTTCAATGCCTATGGGCCAACGGAAGCGAGTGTTTGTGCCACTATTGGACAATGTTATCAAGATGATTTAAAGGTGACGATTGGTAAGGCGATCGCCAATGTCCAAATTTATATTTTAGATGCCTTTTTACAGCCGGTGCCGGTGGGAGTGTCAGGAGAGTTATACATTGGTGGAGTTGGGGTGGCAAGGGGCTATTTAAATCGTCCTGAATTAACCCAAGAAAAATTTATTGCTAATCCTTTTAGTAACGACCCAGATTCTCGGCTCTATAAAACTGGCGACTTAGCGCGTTATTTACCCGATGGTAATATTGAATATTTAGGACGCATTGACAATCAGGTAAAAATTCGCGGTTTTCGCATTGAGTTAGGAGAAATTGAAGCGGTTCTGAGTCAATGTCCCGATGTGCAAAATACGGCGGTG SEQ IDGAAATTCTGGCTCAAATATGGGGGCAAGTTCTCAAGATAGAAAGAGTCAGCAGAGAAGAT NO. 39 TAATTTCTTTGAATTGGGGGGGCATTCCCTTTTAGCTACCCAGGTAATGTCCCGTCTGCGT 3 nuclGAAACTTTTCAAGTCGAATTACCTTTGCGTAGTCTCTTTACCGCTCCCACTATTGCTGAA acidTTGGCCCTAACAATT SEQ IDAACGACAGTGCTAACCTCCCGTTATCTTTTGCTCAACAACGTTTATGGTTTCTGGATCAA NO. 
40 CTTAGAACCTAACAGCGCCTTTTATCATGTAGGGGGAGCCGTAAGACTAGAAGGAACATTA 4 nuclAATATTACTGCCTTAGAGCAAAGCTTAAAAGAAATTATTAATCGTCATGAAGCTTTACGC acidACAAATTTTATAACGATTGATGGTCAAGCCACTCAAATTATTCACCCTACTATTAATTGGCGATTGTCTGTTGTTGATTGTCAAAATTTAACCGACACTCAATCTCTGGAAATTGCGGAAGCTGAAAAGCCCTTTAATCTTGCTCAAGATTGCTTATTTCGTGCTACTTTATTCGTGCGATCACCGCTAGAATATCATCTACTCGTGACCATGCACCATATTGTTAGCGATGGCTGGTCAATTGGAGTATTTTTTCAAGAACTAACTCATCTTTACGCTGTCTATAATCAGGGTTTACCCTCATCTTTAACGCCTATTAAAATACAATATGCTGATTTTGCGGTCTGGCAACGGAATTGGTTACAAGGTGAAATTTTAAGTAATCAATTGAATTATTGGCGCGAACAATTAGCAAATGCTCCTGCTTTTTTACCTTTACCGACAGATAGACCTAGGCCCGCAATCCAAACTTTTATTGGTTCTCATCAAGAATTTAAACTTTCTCAGCCATTAAGCCAAAAATTGAATCAACTAAGTCAGAAGCATGGAGTGACTTTATTTATGACTCTCCTGGCTGCTTTTGCTACCTTACTTTACCGTTATACAGGACAAGCAGATATTTTAGTTGGTTCTCCTATTGCTAACCGTAATCGTAAGGAAATTGAGGGATTAATCGGCTTTTTTGTTAATACATTAGTTCTGAGATTGAGTTTAGAT SEQ IDACCTATGCTGAATTAAATCATCAAGCTAATCAGTTAGTCCATTACTTACAAACTTTAGGA NO. 41 AATTGGGCCAGAGGTCTTAGTCGCTATTTCAGTAGAACGTTCTTTAGAAATGATTATCGGC 4 nuclTTATTAGCCATTCTCAAGGCGTGTGGTGCTTATCTCCCTCTTGCTCCTGACTATCCCACT 
acidGAGCGTCTTCAGTTCATGTTAGAAGATAGTCAAGCTTCTTTTTTGATTACCCACAGTTCTTTATTAGAAAAATTGCCTTCTTCTCAAGCGACTCTAATTTGTTTAGATCACATCCAAGAGCAGATTTCTCAATATTCTCCCGATAATCTTCAAAGTGAGTTAACTCCTTCCAATTTGGCTAACGTTATTTACACCTCTGGCTCTACGGGTAAGCCTAAAGGGGTGATGGTTGAACATCGGGGCTTAGTTAACTTAGCGAGTTCTCAAATTCAATCTTTTGCAGTCAAAAATAACAGTCGTGTACTGCAATTTGCTTCCTTTAGTTTTGATGCTTGTATTTCAGAAATTTTGATGACCTTTGGTTCTGGAGCGACTCTTTATCTTGCTCAAAAAAATGATTTATTGCCAGGTCAGCCATTAATGGAAAGGTTAGAAAAGAATAAAATTACCCATGTTACTTTACCCCCTTCAGCTTTAGCTGTTTTACCAAAAAAACCGTTACCCAACTTACAAACTTTAATTGTGGCGGGTGAGGCTTGTCCTCTGGATTTAGTCAAACAATGGTCAGTCGGTAGAAACTTTTTCAATGCCTATGGCCCGACAGAAACGAGTGTTTGTGCCACGATTGGACAATGTTATCAAGATGATTTAAAGGTCACGATTGGTAAGGCGATCGCTAATGTCCAAATTTATATTTTGGATGCCTTTTTACAACCAGTACCCATCGGAGTACCAGGGGAATTATACATTGGTGGAGTCGGAGTTGCGAGGGGTTATCTAAATCGTCCTGAATTAACGGCGGAAAGATTTATTCCTAATCCTTTTGATCCCCCCCTAACCCCCCTTAAAAAGGGGGGAGATAAGAGCTATGAAACTTTTAAAAAGGGGGAAGAGCAACCATCAAAACTCTATAAAACGGGAGATTTAGCTCGTTATTTACCCGATGGCAATATTGAATATTTAGGACGCATTGACAATCAGGTAAAAATTCGCGGTTTTCGCATTGAGTTAGGAGAAATTGAAGCGGTTCTGAGTCAATGTCCCGATGTGCAAAATACGGCGGTG SEQ IDTTACAATTAGCTCAAATCTGGTCAGAGATTTTAGGCATTAATAATATTGGTATTCAGGAA NO. 42 TAACTTCTTTGAATTAGGCGGTCATTCTTTATTAGCAGTCAGTCTGATCAATCGTATTGAA 4 nuclCAAAAGTTAGATAAACGTTTACCATTAACCAGTCTTTTTCAAAATGGAACCATAGCAAGT acidCTAGCTCAATTACTAG SEQ IDACTCCATTTTTTGCTGTTCATCCCATTGGTGGTAATGTGCTATGTTATGCCGATTTAGCT NO. 
43CGTAATTTAGGAACGAAACAGCCGTTTTATGGATTACAATCATTAGGGCTAAGTGAATTA TE nuclGAAAAAACTGTAGCCTCTATTGAAGAAATGGCGATGATTTATATTGAAGCAATACAAACT acidGTTCAAGCCTCTGGTCCCTACTATTTAGGAGGTTGGTCAATGGGAGGAGTGATAGCTTTTGAAATCGCCCAACAATTATTGACCCAAGGTCAAGAAGTTGCTTTACTGGCTTTAATAGATAGTTATTCTCCCAGTTTACTTAATTCAGTTAATAGGGAGAAAAATTCTGCTAATTCCCTGACAGAAGAATTTAATGAAGATATCAATATTGCCTATTCTTTCATCAGAGACTTAGCAAGTATATTTAATCAAGAAATCTCTTTCTCTGGGAGTGAACTTGCTCATTTTACATCAGACGAATTACTAGACAAGTTTATTACTTGGAGTCAAGAGACGAATCTTTTGCCGTCAGATTTTGGGAAGCAGCAGGTTAAAACCTGGTTTAAAGTTTTCCAGATTAATCACCAAGCTTTGAGCAGCTATTCTCCCAAGACGTATCTGGGTAGAAGTGTTTTCTTAGGAGCGGAAGACAGTTCTATTAAAAATCCTGGTTGGCATCAA SEQ IDAGCGGGTCTCAAGACCAAAAAACGATACAGTTTAGCCTCTACTACTTTGGTAGCTATGAA NO. 44GCGGAATTTAACCCGAATAAATATAACTTACTGTTTGAAGGAGCTAAATTTGGCGATCGC MO nuclGCTGGTTTTACGGCCCTTTGGATTCCTGAACGTCATTTCCACGCTTTTGGTGGTTTTTCT acidCCCAATCCTTCGGTTTTGGCGGCGGCTTTAGCACGGGAAACCAAACAGATTCAACTGCGATCAGGCAGTGTGGTTTTACCGCTACATAATTCCATCCGAGTCGCCGAAGAATGGGCAGTGGTGGACAATCTTTCCCAGGGCCGCGTTGGTATTGCTTTTGCATCGGGTTGGCATCCCCAGGATTTTGTCTTGGCTCCCCAGTCCTTTGGCCAACATCGGGAATTGATGTTCCAAGAAATTGAAACCGTCCAGAAACTTTGGCGAGGGGAAGCGATCACCGTGCCAGACGGAAAGGGTCAAAGGGTAGAGGTTAAAACCTATCCCCAACCGATGCAGTCCCAGTTACCCAGCTGGATTACTATTGTCAATAATCCCGATACCTATATCAGAGCAGGGGCGATCGGTGCTAATATCCTTACCAATCTGATGGGGCAAAGCGTGGAAGATTTAGCCCGTAATATTGCGCTATATCGTCAATCTTTGGCAGAGCATGGTTATGATCCCGCGTCGGGAACGGTGACAGTTCTCCTGCATACTTTTGTTGGCAAGGATTTAGAACAAGTTCGAGAACAGGCTCGCCAACCCTTTGGGCAATACCTCACCTCCTCTGTCGGACTCTTGCAGAACATGGTCAAGAGCCAGGGCATGAAAGTGGATTTTGAACAATTAAGAGACGAAGATCGGGACTTTCTCCTCGCTTCTGCCTATAAACGCTATACAGAAACCAGTGCTTTAATTGGCACACCCGAATCCTGTCGTCAAATTATTGATCATTTGCAGTCCATCGGTGTGGATGAAGTGGCTTGTTTTATTGATTTTGGGGTAGATGAACAAACAGTTTTGGCCAATTTACCCTATCTCCAGTCCCTAAAAGACTTATATCAA SEQ IDATTGATCCCCCCCTAACCCCCCTTGATAAGGGGATTGATCCCCCCCTAACCCCCCTTGAT NO. 45AAGGGGATTGATCCCCCCCTAACCCCCCTTGATAAGGGG SP 1 nucl acid SEQ IDCCTTATCAAGGGGGGTTAGGGGGGGATCAATCCCCTTATCAAGGGGGGTTAGGGGGGGAT NO. 
46CAATCCCCTTATCAAGGGGGGTTAGGGGGTGATCAATCCCCTTATCAAGGGGGGTTAGGG SP 2 nuclGGTGATCAATCCCCTTATCAAGGGGGGTTAGGGGGGGATCAATCCCCTTATCAAGGAGAG acidTTAGGGGGGGATCAATCCCCTTATCAAGGGGGGTTAGGGGGGGATCAAGTC SEQ IDCCTGCTTCAGAAATGCGAGAGTGGGTCGAAAACACTGTTAGTCGCATCTTGGCTTTCCAA NO. 47CCAGAACGCGGTTTAGAAATTGGTTGTGGTACAGGTTTGTTACTCTCCAGGGTAGCAAAG MT nuclCATTGTCTTGAATATTGGGCAACGGATTATTCCCAAGGGGCGATCCAGTATGTTGAACGG acidGTTTGCAATGCCGTTGAAGGTTTAGAACAGGTTAAATTACGCTGTCAAATGGCAGATAATTTTGAAGGTATTGCCCTACATCAATTTGATACCGTCGTCTTAAATTCGATTATTCAGTATTTTCCCAGTGTGGATTATCTGTTACAGGTGCTTGAAGGGGCGATCAACGTCATTGGCGAGCGAGGTCAGATTTTTGTCGGGGATGTGCGGAGTTTACCCCTATTAGAGCCATATCATGCGGCTGTGCAATTAGCCCAAGCTTCTGACTCGAAAACTGTTGAACAATGGCAACAACAGGTGCGTCAAAGTGTAGCAGGTGAAGAAGAACTGGTCATTGATCCCACATTGTTCCTGGCTTTAAAACAACATTTTCCGCAAATTAGCTGGGTAGAAATTCAACCGAAACGGGGTGTGGCTCACAATGAGTTAACTCAATTTCGCTATGATGTCACTCTCCATTTAGAGACTATCAATAATCAAGCATTATTGAGCGGCAATCCAACGGTAATTACCTGGTTAAATTGGCAACTTGACCAACTGTCTTTAACACAAATTAAAGATAAATTATTAACAGACAAACCTGAATTGTGGGGAATTCGTGGTATTCCTAATCAGCGAGTTGAAGAGGCTCTAAAAATTTGGGAATGGGTGGAAAATGCCCCTGATGTTGAAACGGTTGAACAACTCAAAAAACTTCTCAAACAACAAGTAGATACTGGTATTAATCCTGAACAGGTTTGGCAATTAGCTGAGTCTCTCGGTTACACCGCTCACCTTAGTTGGTGGGAAAGTAGTCAAGACGGTTCCTTTGATGTCATTTTTCAGCGGAATTCAGAAGCGGAGGACTCAAAAAAATTAACCCTTTCAAAACTTGCTTTCTGGGATGAAAAACCCTTTAAAATAAAGCCCTGGAGTGACTATACTAACAACCCTCTGCGCGGTAAGTTAGTCCAAAAATTA ATTCCT SEQID ATGACAAATTATGGCAAATCTATGTCTCATTACTATGATCTAGTGGTAGGACATAAAGGT NO. 
48TATAACAAAGATTACGCCACTGAAGTAGAATTCATTCACAATTTAGTTGAGACTTACACA MT 2ACTGAAGCCAAATCTATCCTATACTTGGGCTGTGGTACGGGTTATCATGCCGCTCTTTTA nucl acidGCACAGAAAGGGTATTCTGTACATGGTGTTGATCTCAGTGCTGAAATGTTAGAGCAGGCTAAAACTCGCATTGAAGATGAAACAATAGCTTCTAATCTGAGTTTTTCTCAAGGAAATATTTGTGAAATCCGTTTAAATCGTCAGTTTAATGTTGTTCTTGCTCTATTTCATGTGGTTAACTATCAAACGACCAATCAAAATTTACTGGCAACGTTTGCAACGGTTAAAAACCATTTAAAAGCTGGGGGGATTTTTATTTGTGATGTGTCCTATGGGTCTTACGTACTGGGGGAATTTAAGAGTCGGCCTACGGCATCAATATTGCGTTTAGAGGATAATTCCAATGGTAACGAAGTAACCTATATTAGTGAACTAAATTTTTTAACCCATGAAAATATAGTGGAAGTTACTCACAATTTATGGGTAACAAATCAAGAAAATCAACTTCTAGAGAATTCACGGGAAACACATCTTCAGCGCTATCTTTTCAAGCCTGAAGTTGAATTGTTGGCTGATGCTTGTGAACTAACTGTTCTTGATGCGATGCCCTGGCTTGAACAACGTCCTTTGACAAACATTCCTTGTCCTTCAGTTTGTTTTGTTATTGGGCATAAAACAACCCATTCAGCTTAA SEQ ID CCGACCTGTGATAAACAATTC NO. 49Primer A SEQ ID CKNCCDGTDATRAANARYTC NO. 50 Primer B SEQ IDTTCAATATCCTGGGGATA NO. 51 Primer C SEQ ID YTCDATRTCYTGNGGRTA NO. 52Primer D SEQ ID CGTTGGTTACAGGCCCTTTCT NO. 53 Primer E SEQ IDMGNTGGYTNCARGCNYTNWS NO. 54 Primer F SEQ ID TTAGACTTAAGCCATTGG NO. 55Primer G SEQ ID YTNGAYYTNWSNCAYTGG NO. 56 Primer H SEQ IDCATAGAAGAATCGAGACCATATTC NO. 57 Primer I SEQ ID CATNSWNSWRTCNARNCCRTAYTCNO. 58 Primer J SEQ IDMTTQTASSANALASFNQFLRDVKAIAQPYWYPTVSNKRSFSEVIRSWGMLSLLIFLIVGL NO. 59VAVTAFNSFVNRRLIDVIIQEKDASQFASTLTVYAIGLICVTLLAGFTKDIRKKIALDWY ABCQWLNTQIVEKYFSNRAYYKINFQSDIDNPDQRLAQEIEPIATNAISFSATFLEKSLEMLT TransporterFLVVVWSISRQIAIPLMFYTIIGNFIAAYLNQELSKINQAQLQSKADYNYALTHVRTHAESIAFFRGEKEEQNIIQRRFQEVINDTKNKINWEKGNEIFSRGYRSVIQFFPFLVLGPLYIKGEIDYGQVEQASLASFMFASALGELITEFGTSGRFSSYVERLNEFSNALETVTKQAENVSTITTIEENHFAFEHVTLETPDYEKVIVEDLSLTVQKGEGLLIVGPSGRGKSSLLRAIAGLWNAGTGRLVRPPLEEILFLPQRPYIILGTLREQLLYPLTNSEMSNTELQAVLQQVNLQNVLNRVDDFDSEKPWENILSLGEQQRLAFARLLVNSPSFTILDEATSALDLTNEGILYEQLQTRKTTFISVGHRESLFNYHQWVLELSADSSWELLSVQDYRLKKAGEMFTNASSNNSITPDITIDNGSEPEIVYSLEGFSHQEMKLLTDLSLSSIRSKASRGKVITAKDGFTYLYDKNPQ ILKWLR SEQID ATGACAACCCAAACAGCTTCTAGTGCCAATGCCCTTGCTTCCTTTAACCAATTTTTAAGG NO. 
60GATGTAAAGGCGATCGCCCAACCCTATTGGTATCCCACTGTATCAAATAAAAGAAGCTTT ABCTCTGAGGTTATTCGTTCCTGGGGAATGCTATCACTGCTTATCTTTTTGATTGTGGGATTA TransporterGTCGCCGTCACGGCTTTTAATAGTTTTGTTAATCGTCGTTTAATTGATGTCATTATTCAA Nucl acidGAAAAAGATGCGTCTCAATTTGCCAGTACATTAACTGTCTATGCGATCGGATTAATCTGTGTAACGCTGCTGGCAGGGTTCACTAAAGATATTCGCAAAAAAATTGCCCTAGATTGGTATCAATGGTTAAACACCCAGATTGTAGAGAAATATTTTAGTAATCGTGCCTATTATAAAATTAACTTTCAATCTGACATTGATAACCCCGATCAACGTCTAGCCCAGGAAATTGAACCGATCGCCACAAACGCCATTAGTTTCTCGGCCACTTTTTTGGAAAAAAGTTTGGAAATGCTAACTTTTTTAGTGGTAGTTTGGTCAATTTCTCGACAGATTGCTATTCCGCTAATGTTTTACACGATTATCGGTAATTTTATTGCCGCCTATCTAAATCAAGAATTAAGCAAGATCAATCAGGCACAACTGCAATCAAAAGCAGATTATAACTATGCCTTAACCCATGTTCGGACTCATGCGGAATCTATTGCTTTTTTTCGGGGAGAAAAAGAGGAACAAAATATTATTCAGCGACGTTTTCAGGAAGTTATCAATGATACGAAAAATAAAATTAACTGGGAAAAAGGGAATGAAATTTTTAGTCGGGGCTATCGTTCCGTCATTCAGTTTTTTCCTTTTTTAGTCCTTGGCCCTTTGTATATTAAAGGAGAAATTGATTATGGACAAGTTGAGCAAGCTTCATTAGCTAGTTTTATGTTTGCATCGGCCCTGGGAGAATTAATTACAGAATTTGGTACTTCAGGACGTTTTTCTAGTTATGTAGAACGTTTAAATGAATTTTCTAATGCCTTAGAAACTGTGACTAAACAAGCCGAGAATGTCAGCACAATTACAACCATAGAAGAAAATCATTTTGCCTTTGAACACGTCACCCTAGAAACCCCTGACTATGAAAAGGTGATTGTTGAGGATTTATCTCTTACTGTTCAAAAAGGTGAAGGATTATTGATTGTCGGGCCCAGTGGTCGAGGTAAAAGTTCTTTATTAAGGGCGATCGCCGGTTTATGGAATGCTGGCACTGGGCGTTTAGTGCGTCCTCCCCTAGAAGAAATTCTCTTTTTGCCCCAACGTCCCTACATTATTTTGGGAACCTTACGCGAACAATTGCTGTATCCTCTAACCAATAGTGAGATGAGCAATACCGAACTTCAAGCAGTATTACAACAAGTCAATTTGCAAAATGTGCTAAATCGGGTGGATGACTTTGACTCCGAAAAACCCTGGGAAAACATTCTCTCCCTCGGTGAACAACAACGCCTAGCCTTTGCTCGATTGTTAGTGAATTCTCCGAGTTTTACCATTTTAGATGAGGCGACCAGTGCCTTAGATTTAACAAATGAGGGGATTTTATACGAGCAATTACAAACTCGCAAGACAACCTTTATTAGTGTGGGTCATCGAGAAAGTTTGTTTAATTACCATCAATGGGTTTTAGAACTTTCTGCTGACTCTAGTTGGGAACTCTTAAGCGTTCAAGATTATCGCCTTAAAAAAGCGGGAGAAATGTTTACTAATGCTTCGAGTAACAATTCCATAACACCCGATATTACTATCGATAATGGATCAGAACCAGAAATAGTCTATTCTCTTGAAGGATTTTCCCATCAGGAAATGAAACTATTAACAGACCTATCACTCTCTAGCATTCGGAGTAAAGCCAGTCGAGGGAAGGTGATTACAGCCAAGGATGGTTTTACCTACCTTTATGACAAAAATCCTCAGATATTAAAGTGGCTCAGAACTTAA
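The relationship between the ABC transporter nucleic acid (SEQ ID NO. 60) and the ABC transporter protein (SEQ ID NO. 59) can be checked with a short translation sketch. The following is only an illustration: it assumes the standard genetic code, the function name `translate` is hypothetical, and only the first 60 nucleotides of SEQ ID NO. 60 as printed above are used.

```python
from itertools import product

# Standard genetic code (an assumption of this sketch), with codons
# enumerated in T, C, A, G order at every position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; '*' marks a stop codon.

    Trailing bases that do not fill a codon are ignored, and codons
    containing characters other than A, C, G, T are reported as 'X'.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First 60 nucleotides of SEQ ID NO. 60 as given in the listing.
SEQ60_START = "ATGACAACCCAAACAGCTTCTAGTGCCAATGCCCTTGCTTCCTTTAACCAATTTTTAAGG"

if __name__ == "__main__":
    print(translate(SEQ60_START))
```

The printed peptide can be compared by eye with the first residues of SEQ ID NO. 59.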

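Primers B, D, F, H and J in the listing above (SEQ ID NO. 50, 52, 54, 56 and 58) are written with IUPAC ambiguity codes, each following a specific primer (A, C, E, G and I). A minimal sketch of how such a degenerate primer could be matched against a template is given below. The helper names `primer_to_regex`, `degeneracy` and `find_sites` are hypothetical, and the sketch only searches the given strand (no reverse complement).

```python
import re

# IUPAC nucleotide ambiguity codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_to_regex(primer: str) -> str:
    """Turn a degenerate primer into a regular-expression pattern string."""
    parts = []
    for base in primer.upper():
        allowed = IUPAC[base]
        parts.append(allowed if len(allowed) == 1 else f"[{allowed}]")
    return "".join(parts)


def degeneracy(primer: str) -> int:
    """Number of distinct plain sequences the degenerate primer stands for."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def find_sites(primer: str, template: str) -> list[int]:
    """Return 0-based start positions where the primer can match the template.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile(f"(?=({primer_to_regex(primer)}))")
    return [m.start() for m in pattern.finditer(template.upper())]


# Primer pairs copied from the listing (SEQ ID NO. 49 to 52).
PRIMER_A = "CCGACCTGTGATAAACAATTC"
PRIMER_B = "CKNCCDGTDATRAANARYTC"
PRIMER_C = "TTCAATATCCTGGGGATA"
PRIMER_D = "YTCDATRTCYTGNGGRTA"

if __name__ == "__main__":
    print(find_sites(PRIMER_B, PRIMER_A), find_sites(PRIMER_D, PRIMER_C))
    print(degeneracy(PRIMER_D))
```

In this sketch the specific primers serve as small templates. In practice the template would be a longer sequence such as SEQ ID NO. 61.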
In one embodiment the entire gene cluster is transformed and expressed in a heterologous system. SEQ ID NO. 61 encompasses the genes of said cluster.
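The annotation accompanying SEQ ID NO. 61 assigns nucleotide ranges to the individual genes of the cluster. As a minimal sketch, these ranges can be held in a small table and used to cut a gene out of the full sequence. The coordinates are taken from that annotation and are read here as 1-based and inclusive, which is an assumption. The names `Feature`, `GENES`, `extract` and `adjacent_overlaps` are hypothetical.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


CLUSTER_LENGTH = 27260  # SEQ ID NO. 61 spans positions 1-27260

# Gene ranges as given in the annotation of SEQ ID NO. 61.
GENES = [
    Feature("Adenylation-Protein (A*)", 1, 1743),
    Feature("Acyl-Carrier-Protein (ACP)", 1892, 2158),
    Feature("Methyltransferase (MT)", 2204, 3016),
    Feature("PKS/NRPS (KS-AT-ACP-AMT-MO-C-A-T)", 3464, 13123),
    Feature("NRPS 2 (C-A-Mt-T)", 13120, 17832),
    Feature("NRPS 3 (C-A-T-C-A-T)", 17836, 25194),
    Feature("ABC-Transporter (ABC)", 25257, 27260),
]


def extract(sequence: str, feature: Feature) -> str:
    """Return the part of `sequence` covered by `feature`."""
    if feature.start < 1 or feature.end > len(sequence):
        raise ValueError(f"{feature.name} lies outside the sequence")
    return sequence[feature.start - 1:feature.end]


def adjacent_overlaps(features: list[Feature]) -> list[tuple[str, str, int]]:
    """List neighbouring features that share nucleotides, with overlap size."""
    ordered = sorted(features, key=lambda f: f.start)
    result = []
    for left, right in zip(ordered, ordered[1:]):
        shared = left.end - right.start + 1
        if shared > 0:
            result.append((left.name, right.name, shared))
    return result


if __name__ == "__main__":
    for gene in GENES:
        print(f"{gene.name}: {gene.length} nt")
    print(adjacent_overlaps(GENES))
```

Calling `extract(cluster_sequence, GENES[2])` on the full SEQ ID NO. 61 string would, under the stated coordinate assumption, return the methyltransferase gene.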

1-27260 ATGACTATTAACTATGGTGATCTGCAAGAACCCTTTAATAAATTCTCAACCCTAGTTGAAMicroginin- TTACTCCGTTATCGGGCAAGCAGTCAACCGGAACGCCTCGCCTATATTTTTCTGCGAGACCluster GGAGAAATCGAAGAAGCTCGTTTAACCTATGGGGAACTGGATCAAAAGGCTAGGGCGATC1-1743 GCCGCTTATCTACAATCCTTAGAAGCCGAGGGCGAAAGGGGTTTACTGCTCTATCCCCCAAdenylation-GGACTAGATTTTATTTCAGCTTTTTTTGGTTGTTTATATGCGGGAGTCGTTGCCATTCCC Protein(A*) GCCTATCCACCCCGACGGAATCAAAACCTTTTGCGTTTACAGGCGATTATTGCCGATTCT1892-2158 CAAGCCCGATTTACCTTCACCAATGCCGCTCTATTTCCCAGTTTAAAAAACCAATGGGCTAcyl-Carrier-AAAGACCCTGAATTAGGAGCAATGGAATGGATTGTTACCGATGAAATTGACCATCACCTC Protein(ACP) AGGGAGGATTGGCTAGAACCAACCCTCGAAAAAAACAGTCTCGCTTTTCTACAATACACC2204-3016 TCTGGTTCAACGGGAACTCCAAAGGGAGTAATGGTCAGTCACCATAATTTGTTGATTAATMethyltransferaseTCAGCCGATTTAGATCGTGGTTGGGGCCATGATCAAGATAGCGTAATGGTCACTTGGCTA (MT)CCGACCTTCCATGATATGGGTCTGATTTATGGGGTTATTCAGCCTTTGTACAAAGGATTT 3464-13123CTTTGTTACATGATGTCCCCTGCCAGCTTTATGGAACGACCGTTACGTTGGTTACAGGCC PKS/NRPS(KS-AT- CTTTCTGATAAAAAAGCAACCCATAGTGCGGCCCCCAACTTTGCCTACGATCTTTGTGTGACP-AMT-MO-C-A-T)CGGAAAATTCCCCCTGAAAAACGGGCTACGTTAGACTTAAGCCATTGGTGCATGGCCTTA 13120-17832AATGGGGCCGAACCCGTCAGAGCGGAGGTACTTAAAAAGTTTGCGGAGGCTTTTCAAGTT NRPS 2(C-A-Mt-T) TCTGGTTTCAAAGCCACAGCCCTTTGTCCTGGCTACGGTTTAGCAGAAGCCACCCTGAAA17836-25194 GTTACGGCGGTTAGTTATGACAGTCCCCCTTACTTTTATCCCGTTCAGGCTAATGCTTTANRPS 3 (C-A-T-C-GAAAAAAATAAGATTGTGGGAGCCACTGAAACCGATACCAATGTGCAGACCCTCGTGGGC A-T)TGCGGCTGGACAACGATTGATACTCAAATCGTCATTGTCAATCCTGAAACCCTGAAACCT 25257-27260TGCTCCCCTGAAATTGTCGGCGAAATTTGGGTATCAGGTTCAACAATCGCCCAAGGCTATABC-TransporterTGGGGAAAACCTCAAGAGACTCAGGAAACCTTTCAAGCTTATTTGGCAGATACAGGAGCC 
(ABC)GGGCCTTTTCTGCGAACAGGAGACTTGGGCTTCATTAAAGATGGTGAATTGTTTATCACAGGTCGGCTCAAGGAAATTATTCTGATTCGAGGACGCAATAATTATCCCCAGGATATTGAATTAACCGTCCAAAATAGTCATCCCGCTCTGCGTCCCAGTTGTGGGGCTGCTTTTACCGTTGAAAATAAGGGCGAAGAAAAGCTCGTGGTCGTTCAGGAAGTGGAGCGCACCTGGCTCCGTAAGGTAGATATAGATGAGGTAAAAAGAGCCATTCGTAAAGCTGTTGTCCAGGAATATGATTTACAGGTTTATGCGATCGCGCTGATCAGGACTGGCAGTTTACCAAAAACCTCTAGCGGTAAAATTCAGCGTCGTAGCTGTCGGGCCAAATTTTTAGAGGGAAGCCTGGAAATTTTGGGCTAAGAAAATTTCTCGATCGGCACTTAATGTGTTAAATTCGTATGTCGATTGAAACTTCGACCAATTCTTTCTCTCCCCTTAAGTCCATGTCTCTGGATTTGAAAATTCCTTAAACTTTAACTACATTTCTCAAGAAAGCAAATTGAATCTAATGTCCACAGAAATCCCAAACGACAAAAAACAACCGACCCTAACGAAAATTCAAAACTGGTTAGTGGCTTACATGACAGAGATGATGGAAGTGGACGAAGATGAGATTGATCTGAGCGTTCCCTTTGATGAATATGGTCTCGATTCTTCTATGGCAGTTGCTTTGATCGCTGATCTAGAGGATTGGTTACGACGAGATTTACATCGCACCCTGATCTACGATTATCCAACTCTAGAAAAGTTGGCTAAACAGGTTAGTGAACCCTGACATTTTTATAAAGTTTGTGCTTAAAAATTTTGAGGAAGTTCTAAAATGACAAATTATGGCAAATCTATGTCTCATTACTATGATCTAGTGGTAGGACATAAAGGTTATAACAAAGATTACGCCACTGAAGTAGAATTCATTCACAATTTAGTTGAGACTTACACAACTGAAGCCAAATCTATCCTATACTTGGGCTGTGGTACGGGTTATCATGCCGCTCTTTTAGCACAGAAAGGGTATTCTGTACATGGTGTTGATCTCAGTGCTGAAATGTTAGAGCAGGCTAAAACTCGCATTGAAGATGAAACAATAGCTTCTAATCTGAGTTTTTCTCAAGGAAATATTTGTGAAATCCGTTTAAATCGTCAGTTTAATGTTGTTCTTGCTCTATTTCATGTGGTTAACTATCAAACGACCAATCAAAATTTACTGGCAACGTTTGCAACGGTTAAAAACCATTTAAAAGCTGGGGGGATTTTTATTTGTGATGTGTCCTATGGGTCTTACGTACTGGGGGAATTTAAGAGTCGGCCTACGGCATCAATATTGCGTTTAGAGGATAATTCCAATGGTAACGAAGTAACCTATATTAGTGAACTAAATTTTTTAACCCATGAAAATATAGTGGAAGTTACTCACAATTTATGGGTAACAAATCAAGAAAATCAACTTCTAGAGAATTCACGGGAAACACATCTTCAGCGCTATCTTTTCAAGCCTGAAGTTGAATTGTTGGCTGATGCTTGTGAACTAACTGTTCTTGATGCGATGCCCTGGCTTGAACAACGTCCTTTGACAAACATTCCTTGTCCTTCAGTTTGTTTTGTTATTGGGCATAAAACAACCCATTCAGCTTAAATTCTGCTAAAAAAAATCCAACTTACCTTATTCTCTGAAACCACACAAGCCATGAATACAATTCAAGATGCCAAGACCGAAAATTACTCAATCTTAAATCAGTCAATTCCAAGACCTCTCAAACTGAGTAATATCCTATTACGATAAGATTTTGCGTTCTCCTTTGTTTGGAATGTCAGCAGAGGAGTCTCTATATTGGCTAGAGAAATGTTTATGTCAAGAGCATCAGGGCTTCGATGTACAAGTTAAGTATCATCAAAAAATGCTGAAGAATATGTTACGTTTGACCGATAGTTTGG
ATTATCTATGGCCAGTTAACCGTGAAATGCGGCTCATGAAAGCTGGGGGGTCAATTGAACGGGCGATCACCAATAACATTAAAGCTTTTCTTCAATTTAAAGAAACTGTAACCGTATTAAATTAGAAAAACCGCAGTGAGGAATTTGAATGGAACCCATCGCAATTATTGGTCTTGCTTGCCGCTTTCCAGGGGCTGACAATCCAGAAGCTTTCTGGCAACTCATGCGAAATGGGGTGGATGCGATCGCCGATATTCCTCCTGAACGTTGGGATATTGAGCGTTTCTACGATCCCACACCTGCCACTGCCAAGAAGATGTATAGTCGCCAGGGCGGTTTTCTAAAAAATGTCGATCAATTTGACCCTCAATTTTTCCGAATTTCTCCCCTAGAAGCCACCTATCTAGATCCTCAACAAAGACTGCTACTGGAAGTCACCTGGGAAGCCTTAGAAAATGCTGCCATTGTGCCTGAAACCTTAGCTGGTAGCCAATCAGGGGTTTTTATTGGTATCAGTGATGTGGATTATCATCGTTTGGCTTATCAAAGTCCTACTAACTTGACCGCCTATGTGGGTACAGGCAACAGCACCAGTATTGCGGCTAACCGTTTATCATATCTGTTTGATTTGCGTGGCCCCAGTTTGGCCGTAGATACCGCTTGCTCTTCTTCCCTCGTCGCCGTTCACTTGGCCTGTCAGAGTTTGCAAAGTCAAGAATCGAACCTCTGCTTAGTGGGGGGAGTTAATCTCATTTTGTCGCCAGAGACAACCGTTGTTTTTTCCCAAGCGAGAATGATCGCCCCCGACAGTCGTTGTAAAACCTTTGACGCGAGGGCCGATGGTTATGTGCGCTCGGAAGGCTGTGGAGTAGTCGTACTTAAACGTCTTAGGGATGCCATTCAGGACGGCGATCGCATTTTAGCAGTGATTGAAGGTTCCGCGGTGAATCAGGATGGTTTAAGTAATGGACTCACGGCCCCTAATGGCCCTGCTCAACAGGCGGTGATTCGTCAGGCCCTGGCAAATGCCCAGGTAAAACCGGCCCAGATTAGCTATGTCGAAGCCCATGGCACGGGGACAGAATTGGGGGATCCGATCGAAGTTAAATCTCTGAAAGCGGTTTTGGGTGAAAAGCGATCGCTCGATCAAACCTGTTGGCTCGGTTCTGTGAAAACCAACATTGGTCATTTAGAAGCGGCGGCGGGAATGGCGGGTCTGATTAAAGTCGTTCTCTGCCTACAACACCAAGAAATTCCCCCTAATCTCCACTTTCAAACCCTTAATCCCTATATTTCCCTAGCTGACACAGCTTTTGCGATTCCCACTCAGGCTCAACCCTGGCGGACCAAACCCCCTAAGTCTGGTGAAAACGGTGTCGAACGACGTTTAGCAGGACTCAGTTCCTTTGGGTTTGGGGGGACAAATTCCCATGTGATTCTCAGCGAAGCCCCTGTCACCGTTAAAAACAATCAACAAAATGGGCAGAAGTTGATAGAACGTCCCTGGCATTTGCTGACTTTATCTGCCAAGAATGAAGAAGCCTTAAAAGCCTTAGTCCATTGTTATCAAAAGTATTTAGCTGATCATCATGAAATTCCTCTCGCTGATGTTTGTTTTACGGCCAATAGTCGGCGATCGCACTTTAATCATCGTTTAGGAGTAGTGGCTAGAGATCGCTTAGAAATGTTGCAGAAGTTAGAGAACTTTAGTAACCAAGAAAGGATGAGAGAACCGAAGAGTATTAACAAAAAGAAAAAACCTAAAATTGTTTTTCTATTTGCCGGTCAAGGTTCTCAATATGTAGGTATGGGTCGTCAACTGTACGAAACCCAACCCATCTTTCGCCAAACCTTGGATCGCTGTGCTGAAATCCTGCGACCCCATTTAGATCAACCCCTCTTAGAAATTCTTTATCCTGCTGACCCAGAAGCCGAAACAGCGAGTTTTTACCTAGAGCAGACTGCCTATACCCAACCCACTTTATTCGCATTCGAGTATG
CCCTAGCACAGTTATGGCGTTCCTGGGGAATAGAACCGGCGGCAGTAATTGGTCACAGTGTCGGTGAATATGTGGCGGCCACCGTTGCCGGAGCCTTAAGTCTAGAAGAAGGATTAACGCTAATTGCCAAACGGGCAAAACTGATGCAGTCTCTCCCCAAGAATGGGACAATGATCGCCGTTTTTGCCGCAGAAGAGCGGGTTAAAGCTGTTATTGAGCCTTATAGGACTGATGTAGCGATCGCTGCTGTTAATGGACCAGAAAATTTTGTTATTTCAGGAAAAGCGCCGATTATTGCTGAGATTATCATTCATTTAACGGCAGCAGGAATAGAAGTTCGTCCTCTCAAAGTTTCCCATGCTTTTCACTCGCACCTGTTGGAGCCAATTTTAGATTCCTTAGAACAGGAAGCTGCTGCTATTTCCTACCAACCCCTGCAAATTCCCTTAGTTGCTAATTTAACGGGGGAAGTTCTACCAGAAGGAGCAACGATTGAGGCTCGTTACTGGCGAAATCATGCACGCAACCCTGTACAATTTTATGGGAGTATCCAAACGCTGATCGAGCAGAAATTCAGTCTTTTTTTAGAAGTTAGCCCTAAACCGACTTTATCTCGATTGGGTCAACAATGTTGTCCAGAAAGATCGACCACTTGGCTATTTTCCCTCGCCCCTCCTCAAGAAGAAGAACAAAGCCTACTAAATAGTTTGGCGATTCTCTATGATTCCCAAGGAGCCGAAATAAACTGGGAAGGGTTTAATCAAAATTATCCCCACCATTTACTGGCTCTACCGACCTATCCTTTTCAACGTCAACGCTATTGGCTTGAAACCGGTAAACCGACTTCTGAAGAAACAACCATGACGACCAATGCCACTAATGTCCAAGCTATCTCCAGCCATCAAAAACAACAGGAGATTCTAATCACATTGCAAACCCTAGTGGGAAATTTACTGCAATTGTCCCCTGCTGATGTCAATGTTCATACACCTTTCCTGGAGATGGGGGCAGATTCCATTGTCATGGTTGAGGCGGTCAGACGGATTGAGAATACCTATAACGTTAAAATTGCTATGCGTCAGTTATTTGAGGAGTTATCTACTTTAGATGCTTTAGCTACTTATTTAGCTCAAAATCCGGCTACTGATTGCCAAACTGCTCAAATTAATACCGAGGTGTTTTCTGCGCCCATTGCCTGCTCAAATAACCGATCGCCCAATGTCGTGCTGAGTTCTAATACCAACGGCTTTCAACGTCAAACAGCTTCTCCAGGTTTTTCGGCGATCGCCCCCCTTGCAGGAATGGGAGGAGCAGGGGAAATGGGAGGAGTTGAAGTGCCTCAAGTTTCTGTGCCACAAACCAGTGCGGTAACAGCCTCAGGTTCAACCGTTTCTAGTTCTGCCCTGGAAAACATTATGGGTCAACAGTTACAACTGATGGCCAAACAGTTAGAAGTCTTGCAAACGGCCAATTTTGCCCCGACGACTCCCCGAACCACAGAAAATTCCCCATCTTCCGTCAGTCAAAATAGGTCAAACGGACTTACACAACAGTTAATTCCCCCCCAGCAATTAGCGGCGAACCTAGAGCCAATAGCCAGTCGCACCCGTCAAACCAGCAATCAAGCTTCTGCTCCTAAACCGACAGTAACAGCCACTCCCTGGGGGCCGAAAAAACCACCCACAGGTGGATTCACTCCCCAACAACAGCAACATCTAGAGGCATTAATTGCTCGCTTTACGGAACGTACCAAAACCTCTAAGCAAATTGTGCAAAGCGATCGCCTGCGTTTAGCAGATAGTCGAGCCTCGGTCGGATTCCGTATGTCTATTAAAGAGATGCTTTATCCCATTGTGGCCCAACGTTCTCAAGGATCAAGAATTTGGGATGTGGACGGTAATGAATATATTGATATGACGATGGGGCAAGGGGTAACGCTGTTTGGGCATCAACCAGACTTCATTATGTCGGCCCTACAAAGCCAACTCACTGAAGGCATT
CATCTCAATCCGCGATCGCCAATTGTGGGAGAAGTGGCCGCCTTAATTTGTGAACTAACAGGAGCCGAACGAGCTTGTTTTTGCAACTCTGGAACCGAAGCCGTAATGGCCGCTATTCGTATCGCCAGGGCAACAACAGGTCGGAGTAAAATTGCCCTCTTTGAAGGCTCCTATCATGGACATGCGGACGGAACCCTTTTTAGGAACCAAATTATTGATAACCAACTCCACTCTTTTCCCCTAGCTCTAGGCGTTCCCCCCAGCCTTAGTTCCGATGTGGTGGTATTGGACTATGGCAGTGCGGAAGCTCTGAACTATTTACAAACCCAGGGGCAGGATTTAGCGGCGGTCTTAGTAGAACCAATTCAAAGTGGCAATCCTCTACTCCAACCCCAACAATTTCTCCAAAGTCTGCGACAAATTACCAGTCAAATGGGCATTGCCCTGATTTTTGATGAAATGATTACGGGTTTTCGATCGCACCCAGGGGGAGCGCAAGCTTTATTTGGAGTACAGGCGGATATTGCCACCTATGGCAAAGTAGTTGCGGGAGGAATGCCCATTGGAGTTATTGCAGGTAAGGCCCATTATCTGGACAGCATTGACGGGGGAATGTGGCGTTATGGCGATAAATCCTATCCTGGGGTGGACAGAACCTTTTTTGGGGGAACCTTTAATCAGCATCCGTTAGCAATGGTAGCGGCTAGGGCTGTCCTGACCCATTTAAAGGAGCAGGGGCCAGGTCTGCAACAACAATTAACTGAACGCACTGCGGCCTTAGCCGATACACTGAATCATTATTTTCAAGCCGAAGAAGTTCCTATTAAAATCGAACAGTTTAGTTCTTTCTTCCGGTTTGCCCTCTCTGGCAATTTGGATTTACTTTTCTATCACATGGTAGAAAAAGGTATTTATGTCTGGGAATGGCGTAAACATTTTCTTTCAACCGCCCATACGGAAGCCGATCTTGCCCAATTTGTCCAAGCGGTTAAGGATAGCATCACAGAATTGCGTCAGGGAGGTTTTATCCCCGCAAAAAAGCCTTCCTGGCCAGTGCCAACGCCTCAAATTGATCCCCCCCTAACCCCCCTTGATAAGGGGATTGATCCCCCCCTAACCCCCCTTGATAAGGGGATTGATCCCCCCCTAACCCCCCTTGATAAGGGGGGAGATGTTGATGTCGCGCTTGATAAGGGAGGAAATTCTCATTCTGTTAGGGACAGTAAGTTAGGGAAAGGGAGCGGGTCTCAAGACCAAAAAACGATACAGTTTAGCCTCTACTACTTTGGTAGCTATGAAGCGGAATTTAACCCGAATAAATATAACTTACTGTTTGAAGGAGCTAAATTTGGCGATCGCGCTGGTTTTACGGCCCTTTGGATTCCTGAACGTCATTTCCACGCTTTTGGTGGTTTTTCTCCCAATCCTTCGGTTTTGGCGGCGGCTTTAGCACGGGAAACCAAACAGATTCAACTGCGATCAGGCAGTGTGGTTTTACCGCTACATAATTCCATCCGAGTCGCCGAAGAATGGGCAGTGGTGGACAATCTTTCCCAGGGCCGCGTTGGTATTGCTTTTGCATCGGGTTGGCATCCCCAGGATTTTGTCTTGGCTCCCCAGTCCTTTGGCCAACATCGGGAATTGATGTTCCAAGAAATTGAAACCGTCCAGAAACTTTGGCGAGGGGAAGCGATCACCGTGCCAGACGGAAAGGGTCAAAGGGTAGAGGTTAAAACCTATCCCCAACCGATGCAGTCCCAGTTACCCAGCTGGATTACTATTGTCAATAATCCCGATACCTATATCAGAGCAGGGGCGATCGGTGCTAATATCCTTACCAATCTGATGGGGCAAAGCGTGGAAGATTTAGCCCGTAATATTGCGCTATATCGTCAATCTTTGGCAGAGCATGGTTATGATCCCGCGTCGGGAACGGTGACAGTTCTCCTGCATACTTTTGTTGGCAAGGATTTAGAACAAGTTCGAGAACAGGCTCGCCA
ACCCTTTGGGCAATACCTCACCTCCTCTGTCGGACTCTTGCAGAACATGGTCAAGAGCCAGGGCATGAAAGTGGATTTTGAACAATTAAGAGACGAAGATCGGGACTTTCTCCTCGCTTCTGCCTATAAACGCTATACAGAAACCAGTGCTTTAATTGGCACACCCGAATCCTGTCGTCAAATTATTGATCATTTGCAGTCCATCGGTGTGGATGAAGTGGCTTGTTTTATTGATTTTGGGGTAGATGAACAAACAGTTTTGGCCAATTTACCCTATCTCCAGTCCCTAAAAGACTTATATCAACCTCATCTCCCCCCTTATCAAGGGGGGTTAGGGGGGGATCAATCCCCTTATCAAGGGGGGTTAGGGGGGGATCAATCCCCTTATCAAGGGGGGTTAGGGGGTGATCAATCCCCTTATCAAGGGGGGTTAGGGGGTGATCAATCCCCTTATCAAGGGGGGTTAGGGGGGGATCAATCCCCTTATCAAGGAGAGTTAGGGGGGGATCAATCCCCTTATCAAGGGGGGTTAGGGGGGGATCAAGTCCCTCTCACCGAAGCCCAACGACAACTGTGGATTTTGGCTCAATTAGGAGACAACGGCTCTGTGGCCTATAACCAATCAGTGACATTGCAATTAAGTGGCCCATTAAATCCCGTCGCAATGAATCAAGCTATTCAACAAATCAGCGATCGCCATGAAGCGTTACGAACCAAAATTAATGCCCAGGGAGATAGTCAAGAAATCCTGCCCCAGGTCGAAATTAACTGCCCTATCTTAGACTTCAGTCTTGACCAAGCTTCGGCCCAACAGCAAGCAGAACAATGGTTAAAGGAAGAAAGTGAAAAACCCTTTGATTTGAGCCAGGGTTCTCTCGTGCGTTGGCATCTACTCAAATTAGAACCAGAATTACATTTGTTAGTATTAACGGCCCATCACATTATCAGTGACGGTTGGTCAATGGGGGTAATCCTTCGGGAATTAGGAGAGTTATATTCAGCCAAATGTCAGGGTGTTACGGCTAATCTTAAAACCCCAAAACAGTTTCGAGAATTGATTGAATGGCAAAGCCAGCCAAGCCAAGGGGAAGAACTGAAAAAACAGCAAGCCTATTGGTTAGCAACCCTTGCCGATCCCCCTGTTTTGAATTTACCCACTGACAAACCTCGTCCAGCTTTACCCAGTTACCAAGCTAATCGTCGAAGTCTAACTTTAGATAGCCAATTTACAGAAAAACTAAAGCAATTTAGTCGTAAACAGGGCTGTACCTTGCTGATGACCCTGTTATCGGTTTATAACATTCTCGTTCATCGTTTGACGGGACAGGATGATATTCTGGTGGGTCTGCCAGCCTCTGGACGGGGGCTTTTAGATAGTGAAGGTATGGTGGGTTATTGCACCCATTTTTTACCAATTCGCAGTCAATTAGCAGGTAATCCCACTTTTGCTGAATATCTCAAACAAATGCGGGGGGTTTTGTTGTCGGCTTATGAACATCAGGACTATCCCTTTGCTCTTTTGCTCAATCAGTTAGATTTACCGCGTAATACCAGTCGCTCTCCTTTAATTGATGTCAGTTTCAATTTAGAACCAGTTATTAACCTACCCAAAATGAAAGGATTAGAGATTAGTTTGTTGCCTCAAAGTGTAAGTTTTAAGGATCGAGATTTGCATTGGAATGTGACAGAAATGGGTGGAGAAGCTCTGATTGATTGTGACTACAATACAGACTTATTTAAAGATGAAACGATTCAGCGTTGGTTAGGCCATTTTCAAACCTTACTTGAGGCAGTTATTAATGATTCGCAACAAAATCTGCGGGAATTACCCTTATTAAGTTCTGCTGAACGACAACAGTTATTAGTGGATTGGAATCAAACCAAGACCGACTATCCCCAAGATCAGTGTATTCATCAATTATTTGAAGCGCAAGTTGAACGGACTCCCGATGCGATTGCGGTGGTATTTGAAACTCAACAATTAACTTACAGTGAATTAA
ATTGTCGAGCCAATCAGTTAGCACATTATTTACAAAAATTAGGAGTTGGGCCAGAGGTCTTAGTCGGTATTTTGGTCGAACGTTCTTTAGAAATGATTGTCGGATTGTTAGGGATTCTCAAGGCTGGGGGAGCCTATGTACCTCTTGATCCTGACTATCCCCCTGAACGTCTTCAATTTATGTTAGAAGATAGTCAATTTTTTCTCCTCTTAACCCAACAGCATTTACTGGAATCTTTTGCTCAGTCTTCAGAAACGGCTACTCCCAAGATTATTTGTTTGGATAGCGACTACCAAATTATTTCCCAGGCAAAGAATATTAATCCCGAAAATTCAGTCACAACGAGTAATCTTGCCTATGTAATTTATACCTCTGGTTCGACAGGTAAACCGAAGGGCGTGATGAATAATCATGTTGCTATTAGTAATAAATTGTTATGGGTACAAGACACTTATCCTCTAACCACAGAAGACTGTATTTTACAAAAAACTCCCTTTAGTTTTGATGTTTCAGTGTGGGAATTATTCTGGCCCCTACTAAACGGAGCGCGTTTGGTTTTTGCCAAGCCGAATGGCCATAAAGATGCCAGTTACTTAGTCAATCTGATTCAAGAGCAACAAGTAACAACGCTACATTTTGTGTCTTCTATGCTACAGCTTTTTCTGACAGAAAAAGACGTAGAAAAATGTAATAGTCTTAAACGAGTCATTTGTAGTGGTGAAGCCCTTTCTTTAGAGCTTCAAGAACGTTTTTTTGCTCGTTTAGTCTGTGAATTACACAATCTTTATGGACCGACAGAAGCCGCTATTCATGTCACATTTTGGCAATGTCAATCAGATAGCAATTTGAAAACAGTACCCATTGGTCGGCCGATCGCTAATATCCAAATTTACATTTTAGACTCTCATCTTCAGCCAGTACCTATTGGAGTAATCGGAGAATTGCACATTGGTGGGGTTGGTTTGGCGCGGGGTTATTTAAACAGGCCTGAGTTAACGGCGGAGAAATTTATTGCAAATCCGTTTGCTTCCCTTGATCCCCCCCTAACCCCCCTTGATAAGGGGGGAGATGAGAGCTATAAAACTTTTAAAAAGGGGGGAGAGCAACCATCAAGATTGTATAAAACGGGAGATTTAGCTCGTTATTTACCCGATGGCAAGATTGAGTATCTAGGGCGCATTGATAATCAGGTAAAAATTCGCGGTTTCCGGATTGAATTGGGGGAAATTGAAGCGGTTTTGCTATCCCATCCCCAGGTACGAGAAGCGGTCGTTTTGGTGAGCGAAAGCGATCGCTCTGAAAATCGGGCTTTGGTCGCTTATATTGTCCCTAATGATCCTGCTTGTACGACTCAATCATTACGAGAGTTTGTTAAACGGCAGCTTCCTGACTATATGATCCCAGCTTATTGGCTGATCCTTGACAATTTACCGTTAACCAGCAATGGCAAAATTGATCGTCGGGCTTTACCGTTACCTAATCCAGAGTTAAATCGTTCGATAGACTATGTGGCTCCCAAAAATCCTACCCAGGAGGCGATCGCCGCTATTTTTGGTCAAGTTTTAAAACTGGAAAAAGTGGGAATTTATGATAACTTTTTTGAGATCGGCGGTAATTCTTTGCAAGCCACTCAAGTTATTTCACGCTTACGAGAAAGTTTTGCCCTAGAGTTGCCCTTGCGTCGCCTGTTTGAACAACCGACTGTGGCGGATTTGGCTTTAGCCGTAACGGACATTCATGCCACTTTACAAAAATTACAAACCCCTATTGATGATTTATCAGGCGATCGCGAGGAGATTGAACTATGAAATCTATTGAAACCTTTTTGTCAGATTTAGCCAATCAAGATATTAAACTCTGGATGGACGGCGATCGCCTGCGTTGTAATGCACCCCAGGGCCTATTAACCCCAGAGATTCAAACAGAACTGAAAAACCGTAAAGCAGAAATCATTCACTTTCTCAATCAACTGGGTTCAGAGGAGCAAATTAATCCTAGAA
CGATTCTTCCCATTCCTCGTGATGGCCAATTACCCCTCTCCTTTGCCCAGTCGCGACTCTGGTTCTTGTATCAATTAGAAGGAGCCACGGGAACCTATAACATGACAGGGGCCTTGAGTTTAAGCGGGCCTCTTCAGGTCGAAGCCCTCAAACAAGCCCTAAGAACTATCATTCAACGCCATGAGCCATTGCGTACCAGTTTCCAATCGGTTGACGGGGTTCCAGTGCAGGTGATTAATCCCTATCCTGTTTGGGAATTAGCGATGGTTGATTTGACAGGAAAGGAGACAGAAGCAGAAAAATTGGCCTATCAGGAATCCCAAACCCCGTTTGATTTGACCAATAGTCCTTTGTTGAGGGTAACGCTCCTCAAATTACAGCCAGAAAAGCATATTTTATTAATTAATATGCACCATATTATTTCCGATGGCTGGTCAATCGGTGTTTTTGTTCGTGAATTGTCCCATCTCTATAGGGCTTTTGTGGCGGGTAAAGAACCAACTTTACCGATTTTACCAATTCAGTATGCGGATTTTGCCGTTTGGCAGCGAGAGTGGTTACAGGGTAAGGTTTTAGCGGCTCAATTGGAATATTGGAAGCGACAATTGGCAGATGCTCCTCCTCTGCTGGAACTGCCCACTGATCGCCCTCGTCCCGCAATCCAAACCTTTCAAGGCAAGACAGAAAGATTTGAGCTAGATAGGAAACTGACCCAAGAATTAAAGGCATTAAGTCAACAGTCGGGTTGTACTTTATTTATGACTTTGTTGGCCGCTTTTGGGGTGGTTTTATCCCGTTATAGTGGCCAGACTGATATCGTCATTGGTTCGGCGATCGCCAACCGTAATCGCCAAGACATTGAGGGGTTAATTGGCTTTTTTGTTAACACTTTGGCGTTGAGGTTAGATTTATCAGAAAAACCCAGCTTTGCCGCTTTTTTAAAACAAGTACAGGAAGTCACTCAGGATGCCTATGAGCATCAAGACTTGCCCTTTGAAATGTTAGTGGAAGAATTACAACTAGAGCGCAAATTAGACCGAAATCCTTTGGTACAGGTGATGTTTGCCCTACAAAATGCGGCCAATGAAACCTGGAATTTACCTGGGTTGACCATTGAAGAAATGTCTTGGGAACTTGAACCTGCCCGTTTTGACCTAGAGGTTCATTTATCAGAAGTTAACGCCGGCATAGCTGGATTCTGTTGCTACACCATTGATCTATTTGATGATGCAACGATCGCCCGTCTATTGGAACATTTTCAGAATCTTCTCAGGGCAATTATTGTTAATCCTCAAGAATCGGTAAGTTTATTACCCTTGTTGTCAGAACAGGAAGAAAAGCAACTTTTAGTTGATTGGAATCAAACCCAAGCCGATTATCCCCAAGATAAGCTTGTCCATCAGTTATTTGAAGTTCAAGCAGCCAGTCAGCCAGAAGCGATCGCTCTAATCTTTGAAAATCAGGTTTTGACCTATGGAGAATTAAACCATCGCGCCAATCAATTAGCTCACTATCTTCAGTCGTTAGGAGTCACCAAAGAACAAATCGTCGGGGTTTATCTGGAACGTTCCCTTGAAATGGCGATCGGATTTTTAGGTATTCTCAAAGCAGGAGCCGCCTATCTCCCCATTGATCCTGAATATCCCTCAGTACGCACCCAATTTATTCTCGAAGATACCCAACTTTCGCTTCTCTTAACTCAGGCAGAACTGGCAGAAAAACTGCCCCAGACTCAAAACAAAATTATCTGTCTAGATCGGGACTGGCCAGAAATTACCTCCCAACCCCAGACAAACCTAGACCTAAAGATAGAACCTAATAACCTAGCCTATTGCATCTATACTTCTGGTTCCACAGGACAACCCAAAGGAGTACTGATTTCCCATCAAGCCCTACTCAACTTAATTTTCTGGCATCAACAAGCGTTTGAGATTGGCCCCTTACATAAAGCGACCCAAGTGGCAGGCATTGCTTTCGATGCAACGGTTTGGGAATTGTGG
CCCTATCTGACCACAGGAGCCTGTATTAATCTGGTTCCCCAAAATATTCTGCTCTCACCGACGGATTTACGGGATTGGTTGCTTAACCGAGAAATTACCATGAGTTTTGTGCCAACTCCTTTAGCTGAAAAATTATTATCCTTGGATTGGCCTAACCATTCTTGTCTAAAAACCCTGTTACTGGGAGGTGACAAACTTCATTTTTATCCTGCTGCGTCCCTTCCCTTTCAGGTCATTAACAACTATGGCCCAACGGAAAATACAGTGGTTGCGACCTCTGGACTGGTCAAATCATCTTCATCTCATCACTTTGGAACTCCGACTATTGGTCGTCCCATTGCCAACGTCCAAATCTATTTATTAGACCAAAACCTACAACCTGTCCCCATTGGTGTACCAGGAGAATTACATTTAGGTGGGGCGGGTTTAGCGCAGGGCTATCTCAATCGTCCTGAGTTAACGGCTGAAAAATTTATTGCCAATCCCTTTGATCCCCCCCTAACCCCCCTTGATAAGGGGGGAGAAGAACCCTCAAAACTCTATAAAACGGGAGACTTAGCCCGTTATTTACCCGATGGCAATGTAGAATTTTTGGGACGTATTGACAATCAGGTAAAAATTCGGGGTTTTCGCATCGAAACTGGGGAAATCGAAGCCGTTTTAAGTCAATATTTCCTATTAGCTGAAAGTGTAGTCGTTGCCAAGGAAGATAATACTGGGGATAAACGCCTCGTGGCTTATTTGGTTCCCGCCTTGCAAAATGAGGCCCTACCAGAGCAATTAGCCCAATGGCAAAGTGAATACATCAGTGATTGGCAAAGTCTCTATGAAAGAACCTATAGTCAAGGGCAAGACAGCCTAGCTGATCTCACTTTTAATATCACGGGTTGGAATAGCAGTTATACTCGTCAACCCCTTCCTGCTTCAGAAATGCGAGAGTGGGTCGAAAACACTGTTAGTCGCATCTTGGCTTTCCAACCAGAACGCGGTTTAGAAATTGGTTGTGGTACAGGTTTGTTACTCTCCAGGGTAGCAAAGCATTGTCTTGAATATTGGGCAACGGATTATTCCCAAGGGGCGATCCAGTATGTTGAACGGGTTTGCAATGCCGTTGAAGGTTTAGAACAGGTTAAATTACGCTGTCAAATGGCAGATAATTTTGAAGGTATTGCCCTACATCAATTTGATACCGTCGTCTTAAATTCGATTATTCAGTATTTTCCCAGTGTGGATTATCTGTTACAGGTGCTTGAAGGGGCGATCAACGTCATTGGCGAGCGAGGTCAGATTTTTGTCGGGGATGTGCGGAGTTTACCCCTATTAGAGCCATATCATGCGGCTGTGCAATTAGCCCAAGCTTCTGACTCGAAAACTGTTGAACAATGGCAACAACAGGTGCGTCAAAGTGTAGCAGGTGAAGAAGAACTGGTCATTGATCCCACATTGTTCCTGGCTTTAAAACAACATTTTCCGCAAATTAGCTGGGTAGAAATTCAACCGAAACGGGGTGTGGCTCACAATGAGTTAACTCAATTTCGCTATGATGTCACTCTCCATTTAGAGACTATCAATAATCAAGCATTATTGAGCGGCAATCCAACGGTAATTACCTGGTTAAATTGGCAACTTGACCAACTGTCTTTAACACAAATTAAAGATAAATTATTAACAGACAAACCTGAATTGTGGGGAATTCGTGGTATTCCTAATCAGCGAGTTGAAGAGGCTCTAAAAATTTGGGAATGGGTGGAAAATGCCCCTGATGTTGAAACGGTTGAACAACTCAAAAAACTTCTCAAACAACAAGTAGATACTGGTATTAATCCTGAACAGGTTTGGCAATTAGCTGAGTCTCTCGGTTACACCGCTCACCTTAGTTGGTGGGAAAGTAGTCAAGACGGTTCCTTTGATGTCATTTTTCAGCGGAATTCAGAAGCGGAGGACTCAAAAAAATTAACCCTTTCAAAACTTGCTTTCTGGGATGAAAAACCCTTTAAAAT
AAAGCCCTGGAGTGACTATACTAACAACCCTCTGCGCGGTAAGTTAGTCCAAAAATTAATTCCTAAAGTACGAGAATTTCTGCAAGAAAAACTACCCAGTTATATGGTTCCCCAGGCGTTTGTGCTGCTTGATTCCCTTCCTTTGACCCCCAATGGTAAGGTGGATCGTAAGGCGTTACCTTCTCCTGATGCGGCGACTCGTGATTTAGCGAACAGTTTTGTCTTACCCCGCAATCCGATTGAAGCTCAACTGACTCAAATTTGGAGTGAAGTTTTGGGACTGGAACGCATTGGCGTTAAGGACAACTTTTTTGAATTGGGAGGACATTCTCTTTTGGCTACCCAGGTTTTATCAAGAATTAATTCAGCCTTTGGACTTGATCTTTCTGTGCAAATTATGTTTGAATCACCAACGATCGCGGGCATTGCGGGTTATATTCAAGCGGTAGATTGGGTCGCCCAGGATCAAGCCGATAGCTCGTTAAATCATGAAAATACTGAGGTAGTGGAGTTCTAAGTTATGACGAAAAAGATTGTTGAATTTGTCTGTTATCTACGGGATTTAGGCATTACTTTAGAAGCTGATGAAAACCGCTTACGCTGTCAGGCTCCCGAAGGAATTTTGACCCCAGCACTCCGTCAAGAAATTGGCGATCACAAACTGGAATTATTACAATTTTTACAATGGGTCAAACAGTCTAAAAGTACCGCTCATTTGCCTATTAAACCTGTCGCTAGAGACGGTCATTTACCCCTGTCTTTTGCTCAACAACGTTTATGGTTTTTACATTATCTTTCCCCTGATAGTCGTTCCTACAATACCCTGGAAATATTGCAAATTGATGGGAATCTCAATCTGACTGTGCTAGAGCAGAGTTTGGGGGAATTAATTAACCGCCATGAAATTTTTAGAACAACATTCCCCACTGTTTCAGGGGAACCGATTCAGAAAATTGCACTTCCTAGTCGTTTTCAGTTAAAAGTTGATAATTATCAAGATTTAGACGAAAATGAACAATCAGCTAAAATTCAACAAGTAGCAGAATTGGAAGCAGGACAAGCTTTTGATTTAACGGTGGGGCCACTGATTCAGTTTAAGCTATTGCAATTGAGTCCCCAGAAGTCGGTGCTGCTGTTGAAAATGCACCATATTATCTATGATGGCTGGTCTTTTGGGATTCTGATTCGGGAATTATCGGCTCTATACGAAGCATTTTTAAAGAACTTAGCCAATCCTCTCCCTGCGTTGTCTATTCAGTATGCAGATTTTGCGGTTTGGCAACGTCAATATCTCTCAGGTGAGGTCTTAGATAAACAACTCAATTATTGGCAAGAACAGTTAGCAACAGTCTCTCCTGTTCTTACTTTACCAACGGATAGACCCCGTCCGGCGATACAAACTTTTCAGGGAGGAGTTGAGCGTTTTCAACTGGATCAAAATGTCACTCAAGGTCTTAAAAAGTTAGGTCAAGATCAGGTTGCAACCCTGTTTATGACGTTGTTGGCCGGTTTCGGCGTTTTGCTATCTCGTTATAGTGGTCAATCTGATCTGATGGTGGGTTCTCCGATCGCTAATCGTAATCAAGCAGCGATCGAACCTTTAATTGGCTTTTTTGCTAACACTTTGGCTTTAAGAATTAATTTATCAGAAAATCCCAGTTTTTTAGAATTATTAGAACAAGTTAAACAGACAACTTTAGAGGGTTATGCTCACCAAGACCTACCCTTTGAGATGTTAGTAGAAAAGCTACAACTTGACCGTGATTTGAGCAGAAATCCTTTAGTACAAGTCATGTTTGCGCTACAAAATACCTCTCAAGATACTTGGAATCTTTCGGGTTTAAGTATTGAAAGTTTATCTTTATCAGTGGAAGAAACTGTCAGATTTGATCTAGAAGTAAACTGCTGGCAAAATTCAGAAGGTTTAGCAATAGATTGGATTTACAGCAGAGATTTATTTGACACTGCAACAATTGCAAGAATGGGAGAACATTTTCAAA
ATTTAGTTCAGGCAATCATACTCAATCCAAAAGCTACAGTTAAAGAACTTCCTTTATTAACACCCAAGGAACGTGAGCAATTATTAATATCTTGGAATAATAGCAAGACTGATTATCCTCAAGAGCAGTGTATTTATCAATTATTTGAAGCACAAGTTGAACGGACTCCAAAGGCGATCGCAGTGGTATTTGAGGAGCAATCATTAACATACACTGAATTAAACCATCGCGCTAATCAGTTAGCCCATTATTTACAAACTTTAGGCGTGGGAGCAGAAGTCTTAGTCGGTATTTCCCTAGAACGTTCTTTAGAGATGATTATCGGCTTATTAGGGATTCTCAAGGTAGGTGGTGCTTATCTTCCTCTTGATCCAGACTATCCCACTGAGCGTCTTCAGTTGATGTTAGAAGACAGTCAAGTTCCTTTTTTGATTACCCACAGTTCTTTATTAGCAAAATTGCCTCCCTCTCAAGCAACTCTGATTTGTTTAGATCATATCCAAGAGCAGATTTCTCAATATTCTCCAGATAATCTTCAATGTCAGTTAACTCCTGCCAATTTAGCTAACGTTATTTATACCTCTGGCTCTACGGGTAAGCCTAAAGGGGTGATGGTTGAACATAAAGGTTTAGTTAACTTAGCTCTTGCTCAAATTCAATCTTTTGCAGTCAACCATAACAGTCGTGTGCTGCAATTTGCTTCTTTTAGTTTTGATGCTTGTATTTCAGAAATTTTGATGACCTTTGGTTCTGGAGCGACGCTTTATCTTGCACAAAAAGATGCTTTATTGCCAGGTCAGCCATTAATTGAACGGTTAGTAAAGAATGGAATTACTCATGTGACTTTGCCGCCTTCAGCTTTAGTGGTTTTACCCCAGGAACCGTTACGCAACTTAGAAACCTTAATTGTGGCGGGTGAGGCTTGTTCTCTTGATTTAGTGAAACAATGGTCAATCGATAGAAACTTTTTCAATGCCTATGGGCCAACGGAAGCGAGTGTTTGTGCCACTATTGGACAATGTTATCAAGATGATTTAAAGGTGACGATTGGTAAGGCGATCGCCAATGTCCAAATTTATATTTTAGATGCCTTTTTACAGCCGGTGCCGGTGGGAGTGTCAGGAGAGTTATACATTGGTGGAGTTGGGGTGGCAAGGGGCTATTTAAATCGTCCTGAATTAACCCAAGAAAAATTTATTGCTAATCCTTTTAGTAACGACCCAGATTCTCGGCTCTATAAAACTGGCGACTTAGCGCGTTATTTACCCGATGGTAATATTGAATATTTAGGACGCATTGACAATCAGGTAAAAATTCGCGGTTTTCGCATTGAGTTAGGAGAAATTGAAGCGGTTCTGAGTCAATGTCCCGATGTGCAAAATACGGCGGTGATTGTCCGCGAAGATACTCCTGGCGATAAGCGCTTAGTTGCCTATGTGGTTCTTACTTCTGACTCCCAGATAACTACTAGCGAACTGCGTCAATTTTTGGCGAATCAATTACCCGCCTATCTTGTTCCTAATACCTTTGTTATTTTAGATGATTTGCCCCTAACCCCCAGTGGCAAATGCGATCGCCGTTCCTTACCTATACCCGAAACACAAGCGTTATCAAATGACTATATTGCCCCTAAATCTCCCACTGAAGAAATTCTGGCTCAAATATGGGGGCAAGTTCTCAAGATAGAAAGAGTCAGCAGAGAAGATAATTTCTTTGAATTGGGGGGGCATTCCCTTTTAGCTACCCAGGTAATGTCCCGTCTGCGTGAAACTTTTCAAGTCGAATTACCTTTGCGTAGTCTCTTTACCGCTCCCACTATTGCTGAATTGGCCCTAACAATTGAGCAATCTCAGCAAACCATTGCTGCTCCCCCCATCCTAACCAGAAACGACAGTGCTAACCTCCCGTTATCTTTTGCTCAACAACGTTTATGGTTTCTGGATCAATTAGAACCTAACAGCGCCTTTTATCATGTAGGGGGAGCCGTAAGACTAGAAGGA
ACATTAAATATTACTGCCTTAGAGCAAAGCTTAAAAGAAATTATTAATCGTCATGAAGCTTTACGCACAAATTTTATAACGATTGATGGTCAAGCCACTCAAATTATTCACCCTACTATTAATTGGCGATTGTCTGTTGTTGATTGTCAAAATTTAACCGACACTCAATCTCTGGAAATTGCGGAAGCTGAAAAGCCCTTTAATCTTGCTCAAGATTGCTTATTTCGTGCTACTTTATTCGTGCGATCACCGCTAGAATATCATCTACTCGTGACCATGCACCATATTGTTAGCGATGGCTGGTCAATTGGAGTATTTTTTCAAGAACTAACTCATCTTTACGCTGTCTATAATCAGGGTTTACCCTCATCTTTAACGCCTATTAAAATACAATATGCTGATTTTGCGGTCTGGCAACGGAATTGGTTACAAGGTGAAATTTTAAGTAATCAATTGAATTATTGGCGCGAACAATTAGCAAATGCTCCTGCTTTTTTACCTTTACCGACAGATAGACCTAGGCCCGCAATCCAAACTTTTATTGGTTCTCATCAAGAATTTAAACTTTCTCAGCCATTAAGCCAAAAATTGAATCAACTAAGTCAGAAGCATGGAGTGACTTTATTTATGACTCTCCTGGCTGCTTTTGCTACCTTACTTTACCGTTATACAGGACAAGCAGATATTTTAGTTGGTTCTCCTATTGCTAACCGTAATCGTAAGGAAATTGAGGGATTAATCGGCTTTTTTGTTAATACATTAGTTCTGAGATTGAGTTTAGATAATGATTTAAGTTTTCAAAATTTGCTAAACCATGTTAGAGAGGTTTCTTTAGCAGCCTACGCCCATCAAGATTTACCTTTTGAAATGTTAGTAGAAGCACTACACCCTCAACGAGATCTCAGTCATACCCCTTTATTTCAGGTAATGTTTGTTTTGCAAAATACACCAGTGGCTGATCTAGAACTTAAAAATGTAAAGGTTTGTCCTCTACCGATGGAAAATAAGACTGCTAAATTTGATTTAACCTTATCAATGGAGAATCTAGAGGAAGGATTGATTGGGGTTTGGGAATATAACACCGATCTATTTAATGGCTCAACCATTGAGCGAATGAGTGGACATTTTGTCACTTTGTTAGAAGATATTGTTGCCGCTCCAACGAAGTCAGTTTTACGGTTGTCTTTGCTGACGCAAGAGGAAAAACTGCAATTATTGATTAAAAATCAGGGTGTTCAAGTTGATTATTCTCAAGAGCAGTGCATCCATCAATTATTTGAAGCGCAAGTTGAACGGACTCCCGATGCGATTGCGGTGGTATTTGAGGAGCAATCATTAACCTATGCTGAATTAAATCATCAAGCTAATCAGTTAGTCCATTACTTACAAACTTTAGGAATTGGGCCAGAGGTCTTAGTCGCTATTTCAGTAGAACGTTCTTTAGAAATGATTATCGGCTTATTAGCCATTCTCAAGGCGTGTGGTGCTTATCTCCCTCTTGCTCCTGACTATCCCACTGAGCGTCTTCAGTTCATGTTAGAAGATAGTCAAGCTTCTTTTTTGATTACCCACAGTTCTTTATTAGAAAAATTGCCTTCTTCTCAAGCGACTCTAATTTGTTTAGATCACATCCAAGAGCAGATTTCTCAATATTCTCCCGATAATCTTCAAAGTGAGTTAACTCCTTCCAATTTGGCTAACGTTATTTACACCTCTGGCTCTACGGGTAAGCCTAAAGGGGTGATGGTTGAACATCGGGGCTTAGTTAACTTAGCGAGTTCTCAAATTCAATCTTTTGCAGTCAAAAATAACAGTCGTGTACTGCAATTTGCTTCCTTTAGTTTTGATGCTTGTATTTCAGAAATTTTGATGACCTTTGGTTCTGGAGCGACTCTTTATCTTGCTCAAAAAAATGATTTATTGCCAGGTCAGCCATTAATGGAAAGGTTAGAAAAGAATAAAATTACCCATGTTACTTTACCCCCTTCAGCTTTAGCTGT
TTTACCAAAAAAACCGTTACCCAACTTACAAACTTTAATTGTGGCGGGTGAGGCTTGTCCTCTGGATTTAGTCAAACAATGGTCAGTCGGTAGAAACTTTTTCAATGCCTATGGCCCGACAGAAACGAGTGTTTGTGCCACGATTGGACAATGTTATCAAGATGATTTAAAGGTCACGATTGGTAAGGCGATCGCTAATGTCCAAATTTATATTTTGGATGCCTTTTTACAACCAGTACCCATCGGAGTACCAGGGGAATTATACATTGGTGGAGTCGGAGTTGCGAGGGGTTATCTAAATCGTCCTGAATTAACGGCGGAAAGATTTATTCCTAATCCTTTTGATCCCCCCCTAACCCCCCTTAAAAAGGGGGGAGATAAGAGCTATGAAACTTTTAAAAAGGGGGAAGAGCAACCATCAAAACTCTATAAAACGGGAGATTTAGCTCGTTATTTACCCGATGGCAATATTGAATATTTAGGACGCATTGACAATCAGGTAAAAATTCGCGGTTTTCGCATTGAGTTAGGAGAAATTGAAGCGGTTCTGAGTCAATGTCCCGATGTGCAAAATACGGCGGTGATTGTCCGTGAAGATACTCCTGGCGATAAACGTTTAGTTGCCTATGTGGTTCTTACTTCTGACTCCCAGATAACTACTAGCGAACTGCGTCAATTCTTGGCTAATCAATTACCTGCCTATCTCGTTCCCAATACCTTTGTTATTTTAGATGATTTGCCCCTAACCCCCAATGGTAAATGCGATCGCCGTTCCTTACCGCTTCCTGATGATCAGACCAGAAAAAATATTCCTAAAATTGGCCCGCGTAATTTAGTGGAATTACAATTAGCTCAAATCTGGTCAGAGATTTTAGGCATTAATAATATTGGTATTCAGGAAAACTTCTTTGAATTAGGCGGTCATTCTTTATTAGCAGTCAGTCTGATCAATCGTATTGAACAAAAGTTAGATAAACGTTTACCATTAACCAGTCTTTTTCAAAATGGAACCATAGCAAGTCTAGCTCAATTACTAGCGCAAGAAACAACTCAGCCAGCCTCTTCACCGTTGATTGCTATCCAGTCTCAAGGTGATAAAACTCCATTTTTTGCTGTTCATCCCATTGGTGGTAATGTGCTATGTTATGCCGATTTAGCTCGTAATTTAGGAACGAAACAGCCGTTTTATGGATTACAATCATTAGGGCTAAGTGAATTAGAAAAAACTGTAGCCTCTATTGAAGAAATGGCGATGATTTATATTGAAGCAATACAAACTGTTCAAGCCTCTGGTCCCTACTATTTAGGAGGTTGGTCAATGGGAGGAGTGATAGCTTTTGAAATCGCCCAACAATTATTGACCCAAGGTCAAGAAGTTGCTTTACTGGCTTTAATAGATAGTTATTCTCCCAGTTTACTTAATTCAGTTAATAGGGAGAAAAATTCTGCTAATTCCCTGACAGAAGAATTTAATGAAGATATCAATATTGCCTATTCTTTCATCAGAGACTTAGCAAGTATATTTAATCAAGAAATCTCTTTCTCTGGGAGTGAACTTGCTCATTTTACATCAGACGAATTACTAGACAAGTTTATTACTTGGAGTCAAGAGACGAATCTTTTGCCGTCAGATTTTGGGAAGCAGCAGGTTAAAACCTGGTTTAAAGTTTTCCAGATTAATCACCAAGCTTTGAGCAGCTATTCTCCCAAGACGTATCTGGGTAGAAGTGTTTTCTTAGGAGCGGAAGACAGTTCTATTAAAAATCCTGGTTGGCATCAAGTAATCAATGACTTGCAATCTCAATGGATTAGCGGCGATCACTACGGTTTAATTAAAAATCCAGTCCTCGCTGAAAAACTCAATAGCTACCTAGCCTAAAACTTTCAAAAAGCCTGATTATTGTTTAAAATGAATGATCGTTCACCGGTCAGAGGACAAGTATGACAACCCAAACAGCTTCTAGTGCCAATGCCCTTGCTTCCTTTAACCAATTTTTAAG
GGATGTAAAGGCGATCGCCCAACCCTATTGGTATCCCACTGTATCAAATAAAAGAAGCTTTTCTGAGGTTATTCGTTCCTGGGGAATGCTATCACTGCTTATCTTTTTGATTGTGGGATTAGTCGCCGTCACGGCTTTTAATAGTTTTGTTAATCGTCGTTTAATTGATGTCATTATTCAAGAAAAAGATGCGTCTCAATTTGCCAGTACATTAACTGTCTATGCGATCGGATTAATCTGTGTAACGCTGCTGGCAGGGTTCACTAAAGATATTCGCAAAAAAATTGCCCTAGATTGGTATCAATGGTTAAACACCCAGATTGTAGAGAAATATTTTAGTAATCGTGCCTATTATAAAATTAACTTTCAATCTGACATTGATAACCCCGATCAACGTCTAGCCCAGGAAATTGAACCGATCGCCACAAACGCCATTAGTTTCTCGGCCACTTTTTTGGAAAAAAGTTTGGAAATGCTAACTTTTTTAGTGGTAGTTTGGTCAATTTCTCGACAGATTGCTATTCCGCTAATGTTTTACACGATTATCGGTAATTTTATTGCCGCCTATCTAAATCAAGAATTAAGCAAGATCAATCAGGCACAACTGCAATCAAAAGCAGATTATAACTATGCCTTAACCCATGTTCGGACTCATGCGGAATCTATTGCTTTTTTTCGGGGAGAAAAAGAGGAACAAAATATTATTCAGCGACGTTTTCAGGAAGTTATCAATGATACGAAAAATAAAATTAACTGGGAAAAAGGGAATGAAATTTTTAGTCGGGGCTATCGTTCCGTCATTCAGTTTTTTCCTTTTTTAGTCCTTGGCCCTTTGTATATTAAAGGAGAAATTGATTATGGACAAGTTGAGCAAGCTTCATTAGCTAGTTTTATGTTTGCATCGGCCCTGGGAGAATTAATTACAGAATTTGGTACTTCAGGACGTTTTTCTAGTTATGTAGAACGTTTAAATGAATTTTCTAATGCCTTAGAAACTGTGACTAAACAAGCCGAGAATGTCAGCACAATTACAACCATAGAAGAAAATCATTTTGCCTTTGAACACGTCACCCTAGAAACCCCTGACTATGAAAAGGTGATTGTTGAGGATTTATCTCTTACTGTTCAAAAAGGTGAAGGATTATTGATTGTCGGGCCCAGTGGTCGAGGTAAAAGTTCTTTATTAAGGGCGATCGCCGGTTTATGGAATGCTGGCACTGGGCGTTTAGTGCGTCCTCCCCTAGAAGAAATTCTCTTTTTGCCCCAACGTCCCTACATTATTTTGGGAACCTTACGCGAACAATTGCTGTATCCTCTAACCAATAGTGAGATGAGCAATACCGAACTTCAAGCAGTATTACAACAAGTCAATTTGCAAAATGTGCTAAATCGGGTGGATGACTTTGACTCCGAAAAACCCTGGGAAAACATTCTCTCCCTCGGTGAACAACAACGCCTAGCCTTTGCTCGATTGTTAGTGAATTCTCCGAGTTTTACCATTTTAGATGAGGCGACCAGTGCCTTAGATTTAACAAATGAGGGGATTTTATACGAGCAATTACAAACTCGCAAGACAACCTTTATTAGTGTGGGTCATCGAGAAAGTTTGTTTAATTACCATCAATGGGTTTTAGAACTTTCTGCTGACTCTAGTTGGGAACTCTTAAGCGTTCAAGATTATCGCCTTAAAAAAGCGGGAGAAATGTTTACTAATGCTTCGAGTAACAATTCCATAACACCCGATATTACTATCGATAATGGATCAGAACCAGAAATAGTCTATTCTCTTGAAGGATTTTCCCATCAGGAAATGAAACTATTAACAGACCTATCACTCTCTAGCATTCGGAGTAAAGCCAGTCGAGGGAAGGTGATTACAGCCAAGGATGGTTTTACCTACCTTTATGACAAAAATCCTCAGATATTAAAGTGGCTCAGAACTTAA

1-19. (canceled)
 20. A nucleic acid encoding a peptide spacer sequence (SP), wherein a. the peptide sequence comprises at least 4 glycine amino acids per single repeat unit (SRU), or b. at least five proline and/or leucine amino acids per single repeat unit (SRU), c. a SRU within the SP is between 7 and 15 amino acids in length, and d. the SP comprises between 2 and 10 SRUs.
 21. The nucleic acid according to claim 20, encoding a peptide SRU with a sequence as shown in SEQ ID NO. 20 or SEQ ID NO. 21.
 22. The nucleic acid according to claim 21, with a sequence as shown in SEQ ID NO. 43 or SEQ ID NO. 44.